Generative Adversarial Imitation Learning to Search in Branch-and-Bound Algorithms

被引:2
作者
Wang, Qi [1 ]
Blackley, Suzanne, V [2 ]
Tang, Chunlei [2 ]
机构
[1] Fudan Univ, Shanghai 200438, Peoples R China
[2] Harvard Med Sch, Boston, MA 02120 USA
来源
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT II | 2022年
关键词
Combinatorial optimization; Reinforcement learning; Branch-and-bound; Generative adversarial imitation learning;
D O I
10.1007/978-3-031-00126-0_51
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Recent studies have shown that reinforcement learning (RL) can provide state-of-the-art performance at learning sophisticated heuristics by exploiting the shared internal structure combinatorial optimization instances in the data. However, existing RL-based methods require too much trial-and-error reliant on sophisticated reward engineering, which is laborious and inefficient for practical applications. This paper proposes a novel framework (RAIL) that combines RL and generative adversarial imitation learning (GAIL) to meet the challenge by searching in branch-and-bound algorithms. RAIL has a policy architecture with dual decoders, corresponding to the sequence decoding of RL and the edge decoding of GAIL, respectively. The two complement each other and restrict each other to improve the learned policy and reward function iteratively.
引用
收藏
页码:673 / 680
页数:8
相关论文
共 50 条
[31]   An Enhanced Driving Trajectory Prediction Method Based on Generative Adversarial Imitation Learning [J].
Liu, Ming ;
Lin, Fanrong ;
Zhang, Zhen ;
Jia, Yungang ;
Cui, Jianming .
ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT V, ICIC 2024, 2024, 14879 :179-190
[32]   GAILPG: Multiagent Policy Gradient With Generative Adversarial Imitation Learning [J].
Li, Wei ;
Huang, Shiyi ;
Qiu, Ziming ;
Song, Aiguo .
IEEE TRANSACTIONS ON GAMES, 2025, 17 (01) :62-75
[33]   Generalization and Computation for Policy Classes of Generative Adversarial Imitation Learning [J].
Zhou, Yirui ;
Zhang, Yangchun ;
Liu, Xiaowei ;
Wang, Wanying ;
Che, Zhengping ;
Xu, Zhiyuan ;
Tang, Jian ;
Peng, Yaxin .
PARALLEL PROBLEM SOLVING FROM NATURE - PPSN XVII, PPSN 2022, PT I, 2022, 13398 :385-399
[34]   Distributional generative adversarial imitation learning with reproducing kernel generalization [J].
Zhou, Yirui ;
Lu, Mengxiao ;
Liu, Xiaowei ;
Che, Zhengping ;
Xu, Zhiyuan ;
Tang, Jian ;
Zhang, Yangchun ;
Peng, Yan ;
Peng, Yaxin .
NEURAL NETWORKS, 2023, 165 :43-59
[35]   Improving local search for the weighted sum coloring problem using the branch-and-bound algorithm [J].
Niu, Dangdang ;
Liu, Bin ;
Zhang, Hongming ;
Yin, Minghao .
KNOWLEDGE-BASED SYSTEMS, 2022, 246
[36]   On minimization of the number of branches in branch-and-bound algorithms for the maximum clique problem [J].
Li, Chu-Min ;
Jiang, Hua ;
Manya, Felip .
COMPUTERS & OPERATIONS RESEARCH, 2017, 84 :1-15
[37]   Accelerating convergence of branch-and-bound algorithms for quadratically constrained optimization problems [J].
VanVoorhis, T ;
AlKhayyal, F .
STATE OF THE ART IN GLOBAL OPTIMIZATION: COMPUTATIONAL METHODS AND APPLICATIONS, 1996, 7 :267-284
[38]   Branch-and-bound algorithms: A survey of recent advances in searching, branching, and pruning [J].
Morrison, David R. ;
Jacobson, Sheldon H. ;
Sauppe, Jason J. ;
Sewell, Edward C. .
DISCRETE OPTIMIZATION, 2016, 19 :79-102
[39]   Convergence-order analysis of branch-and-bound algorithms for constrained problems [J].
Kannan, Rohit ;
Barton, Paul I. .
JOURNAL OF GLOBAL OPTIMIZATION, 2018, 71 (04) :753-813
[40]   Branch-and-bound and PSO algorithms for no-wait job shop scheduling [J].
AitZai, Abdelhakim ;
Benmedjdoub, Brahim ;
Boudhar, Mourad .
JOURNAL OF INTELLIGENT MANUFACTURING, 2016, 27 (03) :679-688