Deep Reinforcement Learning for UAV Routing in the Presence of Multiple Charging Stations

被引:20
作者
Fan, Mingfeng [1 ]
Wu, Yaoxin [2 ]
Liao, Tianjun [3 ]
Cao, Zhiguang [4 ]
Guo, Hongliang [4 ]
Sartoretti, Guillaume [5 ]
Wu, Guohua [1 ]
机构
[1] Cent South Univ, Sch Traff & Transportat Engn, Changsha 410017, Hunan, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[3] Acad Mil Sci Beijing, Beijing 100091, Peoples R China
[4] ASTAR, Inst Infocomm Res I2R, Singapore, Singapore
[5] Natl Univ Singapore, Dept Mech Engn, Singapore 119077, Singapore
基金
中国国家自然科学基金;
关键词
Routing; Monitoring; Charging stations; Autonomous aerial vehicles; Reinforcement learning; Vehicle routing; Mathematical programming; Combinatorial optimization problems; deep reinforcement learning; heuristics; UAV routing; UNMANNED AERIAL VEHICLE; OPTIMIZATION; ALGORITHMS; ASSIGNMENT; NETWORKS;
D O I
10.1109/TVT.2022.3232607
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
Deploying Unmanned Aerial Vehicles (UAVs) for traffic monitoring has been a hotspot given their flexibility and broader view. However, a UAV is usually constrained by battery capacity due to limited payload. On the other hand, the development of wireless charging technology has allowed UAVs to replenish energy from charging stations.In this paper, we study a UAV routing problem in the presence of multiple charging stations (URPMCS) with the objective of minimizing the total distance traveled by the UAV during traffic monitoring. We present a deep reinforcement learning based method, where a multi-head heterogeneous attention mechanism is designed to facilitate learning a policy that automatically and sequentially constructs the route, while taking the energy consumption into account. In our method, two types of attentions are leveraged to learn the relations between monitoring targets and charging station nodes, adopting an encoder-decoder-like policy network. Moreover, we also employ a curriculum learning strategy to enhance generalization to different numbers of charging stations. Computational results show that our method outperforms conventional algorithms with higher solution quality (except for exact methods such as Gurobi) and shorter runtime in general, and also exhibits strong generalized performance on problem instances with different distributions and sizes.
引用
收藏
页码:5732 / 5746
页数:15
相关论文
共 59 条
[1]   Autonomous Recharging and Flight Mission Planning for Battery-Operated Autonomous Drones [J].
Alyassi, Rashid ;
Khonji, Majid ;
Karapetyan, Areg ;
Chau, Sid Chi-Kin ;
Elbassioni, Khaled ;
Tseng, Chien-Ming .
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2023, 20 (02) :1034-1046
[2]  
Bahdanau D, 2016, Arxiv, DOI [arXiv:1409.0473, 10.48550/arXiv.1409.0473,1409.0473, DOI 10.48550/ARXIV.1409.0473,1409.0473]
[3]  
Bello I., 2017, 5 INT C LEARN REPR I
[4]  
Cao Y., 2022, P INT S DISTR AUT RO
[5]   Column generation for a UAV assignment problem with precedence constraints [J].
Casbeer, David W. ;
Holsapple, Raymond W. .
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2011, 21 (12) :1421-1433
[6]  
Chen XY, 2019, ADV NEUR IN, V32
[7]   A multi-objective green UAV routing problem [J].
Coelho, Bruno N. ;
Coelho, Vitor N. ;
Coelho, Igor M. ;
Ochi, Luiz S. ;
Haghnazar, Roozbeh K. ;
Zuidema, Demetrius ;
Lima, Milton S. F. ;
da Costa, Adilson R. .
COMPUTERS & OPERATIONS RESEARCH, 2017, 88 :306-315
[8]   IMPROVEMENTS AND EXTENSIONS TO THE MILLER-TUCKER-ZEMLIN SUBTOUR ELIMINATION CONSTRAINTS [J].
DESROCHERS, M ;
LAPORTE, G .
OPERATIONS RESEARCH LETTERS, 1991, 10 (01) :27-36
[9]  
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[10]   A Green Vehicle Routing Problem [J].
Erdogan, Sevgi ;
Miller-Hooks, Elise .
TRANSPORTATION RESEARCH PART E-LOGISTICS AND TRANSPORTATION REVIEW, 2012, 48 (01) :100-114