A deep reinforcement learning approach for solving the Traveling Salesman Problem with Drone

被引:38
作者
Bogyrbayeva, Aigerim [1 ]
Yoon, Taehyun [2 ]
Ko, Hanbum [2 ]
Lim, Sungbin [3 ]
Yun, Hyokun [4 ]
Kwon, Changhyun [5 ]
机构
[1] Suleyman Demirel Univ, Kaskelen, Kazakhstan
[2] UNIST, Ulsan, South Korea
[3] Korea Univ, Seoul, South Korea
[4] Amazon, Seattle, WA USA
[5] Univ S Florida, Tampa, FL 33620 USA
基金
新加坡国家研究基金会; 美国国家科学基金会;
关键词
Vehicle routing; Traveling salesman problem; Drones; Reinforcement learning; Neural networks; NEIGHBORHOOD SEARCH; OPTIMIZATION; LOGISTICS; TRUCK;
D O I
10.1016/j.trc.2022.103981
中图分类号
U [交通运输];
学科分类号
08 ; 0823 ;
摘要
Reinforcement learning has recently shown promise in learning quality solutions in many combinatorial optimization problems. In particular, the attention-based encoder-decoder models show high effectiveness on various routing problems, including the Traveling Salesman Problem (TSP). Unfortunately, they perform poorly for the TSP with Drone (TSP-D), requiring routing a heterogeneous fleet of vehicles in coordination-a truck and a drone. In TSP-D, the two vehicles are moving in tandem and may need to wait at a node for the other vehicle to join. State-less attention-based decoder fails to make such coordination between vehicles. We propose a hybrid model that uses an attention encoder and a Long Short-Term Memory (LSTM) network decoder, in which the decoder's hidden state can represent the sequence of actions made. We empirically demonstrate that such a hybrid model improves upon a purely attention-based model for both solution quality and computational efficiency. Our experiments on the min-max Capacitated Vehicle Routing Problem (mmCVRP) also confirm that the hybrid model is more suitable for the coordinated routing of multiple vehicles than the attention-based model. The proposed model demonstrates comparable results as the operations research baseline methods.
引用
收藏
页数:19
相关论文
共 50 条
[1]   Optimization Approaches for the Traveling Salesman Problem with Drone [J].
Agatz, Niels ;
Bouman, Paul ;
Schmidt, Marie .
TRANSPORTATION SCIENCE, 2018, 52 (04) :965-981
[2]  
Amazon Prime Air, 2022, AMAZON PRIME AIR
[3]  
Applegate D., 2001, Computational Combinatorial Optimization, V2241, P261, DOI DOI 10.1007/3-540-45586-87
[4]  
Bello I., 2016, Neural combinatorial optimization with reinforcement learning, DOI 10.48550/arXiv.1611.09940
[5]   Julia: A Fresh Approach to Numerical Computing [J].
Bezanson, Jeff ;
Edelman, Alan ;
Karpinski, Stefan ;
Shah, Viral B. .
SIAM REVIEW, 2017, 59 (01) :65-98
[6]   A column-and-row generation approach for the flying sidekick travelling salesman problem [J].
Boccia, Maurizio ;
Masone, Adriano ;
Sforza, Antonio ;
Sterle, Claudio .
TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2021, 124
[7]  
Bogyrbayeva A., 2022, arXiv
[8]   A Reinforcement Learning Approach for Rebalancing Electric Vehicle Sharing Systems [J].
Bogyrbayeva, Aigerim ;
Jang, Sungwook ;
Shah, Ankit ;
Jang, Young Jae ;
Kwon, Changhyun .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (07) :8704-8714
[9]   Dynamic programming approaches for the traveling salesman problem with drone [J].
Bouman, Paul ;
Agatz, Niels ;
Schmidt, Marie .
NETWORKS, 2018, 72 (04) :528-542
[10]  
Business Insider, 2017, Business Insider