Graph Transformer with Reinforcement Learning for Vehicle Routing Problem

被引：8

作者：

Fellek, Getu ^{[1
]}

Farid, Ahmed ^{[1
]}

Gebreyesus, Goytom ^{[1
]}

Fujimura, Shigeru ^{[1
]}

Yoshie, Osamu ^{[1
]}

机构：

[1] Waseda Univ, Grad Sch Informat Prod & Syst, Fukuoka, Japan

来源：

IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING | 2023年 / 18卷 / 05期

关键词：

vehicle routing problem; reinforcement learning; graph transformer; edge information embedded multi-head attention;

D O I：

10.1002/tee.23771

中图分类号：

TM [电工技术]; TN [电子技术、通信技术];

学科分类号：

0808 ; 0809 ;

摘要：

Vehicle routing problem (VRP) is one of the classic combinatorial optimization problems where an optimal tour to visit customers is required with a minimum total cost in the presence of some constraints. Recently, VRP is being solved with the use of deep reinforcement learning (DRL), with node sets considered (represented) as a graph structure. Existing Transformer based DRL solutions for VRP rely only on node information ignoring the role of information on the edges between nodes in the graph structure. In this paper, we proposed an attention-based end-to-end DRL model to solve VRP which embeds edge information between nodes for rich graph representation learning. We used Transformer based encoder-decoder framework with an edge information embedded multi-head attention (EEMHA) layer in the encoder. The EEMHA-based encoder learns underlying structure of the graph and generates an expressive graph topology representation by merging node and edge information. We trained our model using proximal policy optimization (PPO) with some code-level optimization techniques. We conducted an experiment on randomly generated instances and on a real-world data generated from road networks to verify the performance of our proposed model. The result from all experiments show that our model performs better than the existing DRL methods and most of the conventional heuristics with good generalizability from random instance training to real-world instance testing of different problem size. (c) 2023 Institute of Electrical Engineers of Japan. Published by Wiley Periodicals LLC.

引用

页码：701 / 713

页数：13

共 44 条

[1]

Malazgirt GA, 2019, Arxiv, DOI arXiv:1905.05567

[2]

Bello I., 2016, PREPRINT

[3]

Brody S, 2022, Arxiv, DOI [arXiv:2105.14491, DOI 10.48550/ARXIV.2105.14491]

[4] Dynamic Vehicle Routing for Robotic Systems [J].

Bullo, Francesco ;

Frazzoli, Emilio ;

Pavone, Marco ;

Savla, Ketan ;

Smith, Stephen L. .

PROCEEDINGS OF THE IEEE, 2011, 99 (09) :1482-1504

[5] Vehicle routing problems for city logistics [J].

Cattaruzza D. ;

Absi N. ;

Feillet D. ;

González-Feliu J. .

EURO Journal on Transportation and Logistics, 2017, 6 (01) :51-79

[6]

Dai HJ, 2020, Arxiv, DOI [arXiv:1603.05629, DOI 10.48550/ARXIV.1603.05629]

[7]

Dai HJ, 2018, Arxiv, DOI arXiv:1704.01665

[8] Learning Heuristics for the TSP by Policy Gradient [J].

Deudon, Michel ;

Cournut, Pierre ;

Lacoste, Alexandre ;

Adulyasak, Yossiri ;

Rousseau, Louis-Martin .

INTEGRATION OF CONSTRAINT PROGRAMMING, ARTIFICIAL INTELLIGENCE, AND OPERATIONS RESEARCH, CPAIOR 2018, 2018, 10848 :170-181

[9] Vehicle Routing Problems for Drone Delivery [J].

Dorling, Kevin ;

Heinrichs, Jordan ;

Messier, Geoffrey G. ;

Magierowski, Sebastian .

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2017, 47 (01) :70-85

[10]

Drori I., 2020, 2020 19 IEEE INT C M, P19

← 1 2 3 4 5 →