Learning Variable Ordering Heuristics with Multi-Armed Bandits and Restarts

Cited by: 6
Authors
Wattez, Hugues [1 ,2 ]
Koriche, Frederic [1 ,2 ]
Lecoutre, Christophe [1 ,2 ]
Paparrizou, Anastasia [1 ,2 ]
Tabary, Sebastien [1 ,2 ]
Affiliations
[1] Univ Artois, CRIL, Arras, France
[2] CNRS, Paris, France
Keywords
SOLVER; SEARCH;
DOI
10.3233/FAIA200115
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In constraint-based applications, the user is often required to be an expert: for a given problem instance, many parameters of the solver in use must be manually tuned to achieve good efficiency. This need for background knowledge hinders the adoption of constraint programming technology by non-expert users. To alleviate this issue, the idea of "autonomous" constraint solving is to adjust the solver parameters automatically, handling any problem instance efficiently without manual tuning. Notably, the choice of the variable ordering heuristic can lead to drastically different performance. A key question then arises: given the set of heuristics provided by the solver, how can we find the best variable ordering heuristic for a problem instance? To answer this question, we propose an algorithmic framework that combines multi-armed bandits and restarts. Each candidate heuristic is viewed as an arm, and the framework learns to estimate the best heuristic using a multi-armed bandit algorithm. The common mechanism of restarts is used to provide feedback for reinforcing the bandit algorithm. Based on a thorough experimental evaluation, we demonstrate that this framework is able to find the best heuristic for most problem instances; notably, it outperforms the state of the art in terms of time and solved instances.
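The abstract does not specify which bandit algorithm or reward function the authors use, so the following is only a minimal sketch of the general scheme it describes: at each restart, a bandit policy (UCB1 here, as an assumption) selects one candidate heuristic, the solver runs until the restart cutoff, and the observed search effort is fed back as a reward. The heuristic names and the `run_restart` callback are illustrative placeholders, not the paper's API.

```python
import math

def bandit_restart_loop(heuristics, run_restart, num_restarts):
    """Select a variable ordering heuristic per restart via UCB1.

    heuristics   -- list of candidate heuristic names (the "arms")
    run_restart  -- callback: runs one solver restart under the given
                    heuristic and returns a reward in [0, 1] (e.g. a
                    normalized measure of search progress; the paper's
                    actual reward is not given in this abstract)
    num_restarts -- total number of restarts to perform
    """
    counts = [0] * len(heuristics)   # times each arm was played
    sums = [0.0] * len(heuristics)   # cumulative reward per arm
    for t in range(1, num_restarts + 1):
        untried = [i for i, c in enumerate(counts) if c == 0]
        if untried:
            # Play every arm once before applying the UCB1 index.
            i = untried[0]
        else:
            # UCB1: empirical mean plus an exploration bonus.
            i = max(range(len(heuristics)),
                    key=lambda a: sums[a] / counts[a]
                    + math.sqrt(2 * math.log(t) / counts[a]))
        reward = run_restart(heuristics[i])
        counts[i] += 1
        sums[i] += reward
    # Report the arm with the best empirical mean reward.
    return max(range(len(heuristics)),
               key=lambda a: sums[a] / max(counts[a], 1))

# Illustrative use with a mock solver in which one heuristic
# consistently yields more search progress than the others.
names = ["lexico", "dom/wdeg", "activity"]
best = bandit_restart_loop(
    names, lambda h: 0.9 if h == "dom/wdeg" else 0.1, 50)
```

Restarts give the scheme a natural episode boundary: each restart is one bandit round, so feedback accrues within a single solving run rather than across a benchmark suite.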
Pages: 371-378
Page count: 8