Trajectory Planning With Deep Reinforcement Learning in High-Level Action Spaces

Cited by: 12
Authors
Williams, Kyle R. [1 ]
Schlossman, Rachel [1 ]
Whitten, Daniel [1 ]
Ingram, Joe
Musuvathy, Srideep [1 ]
Pagan, James [1 ]
Williams, Kyle A. [1 ]
Green, Sam [2 ]
Patel, Anirudh [2 ]
Mazumdar, Anirban [3 ]
Parish, Julie [1 ]
Affiliations
[1] Sandia Natl Labs, Livermore, CA 94551 USA
[2] Semiot Labs, Los Altos, CA 94022 USA
[3] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
Trajectory; Planning; Trajectory planning; Training; Reinforcement learning; Optimization; Aerodynamics
DOI
10.1109/TAES.2022.3218496
Chinese Library Classification
V [Aeronautics, Astronautics];
Discipline Code
08; 0825;
Abstract
This article presents a technique for trajectory planning based on parameterized high-level actions. These high-level actions are subtrajectories that have variable shape and duration. The use of high-level actions can improve the performance of guidance algorithms. Specifically, we show how the use of high-level actions improves the performance of guidance policies that are generated via reinforcement learning (RL). RL has shown great promise for solving complex control, guidance, and coordination problems but can still suffer from long training times and poor performance. This work shows how the use of high-level actions reduces the required number of training steps and increases the path performance of an RL-trained guidance policy. We demonstrate the method on a space-shuttle guidance example. We show the proposed method increases the path performance (latitude range) by 18% compared with a baseline RL implementation. Similarly, we show the proposed method achieves steady state during training with approximately 75% fewer training steps. We also show how the guidance policy enables effective performance in an obstacle field. Finally, this article develops a loss function term for policy-gradient-based deep RL, which is analogous to an antiwindup mechanism in feedback control. We demonstrate that the inclusion of this term in the underlying optimization increases the average policy return in our numerical example.
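The abstract describes a loss-function term for policy-gradient RL that is analogous to antiwindup in feedback control. The paper's exact formulation is not given here; the following is a minimal illustrative sketch of one plausible form, in which pre-saturation policy outputs that exceed the actuator limit are penalized so the optimizer is discouraged from "winding up" into the saturated region. The function name, the quadratic form, and the `weight` parameter are all assumptions, not the authors' method.

```python
import numpy as np

def antiwindup_penalty(raw_actions, action_limit, weight=0.1):
    """Quadratic penalty on the portion of each unclipped (pre-saturation)
    policy output that exceeds the actuator limit. Analogous to antiwindup:
    it discourages the policy from commanding actions deep inside the
    saturated region, where the gradient through the clip is zero."""
    excess = np.maximum(np.abs(raw_actions) - action_limit, 0.0)
    return weight * np.mean(excess ** 2)

# Hypothetical batch of unclipped policy outputs; limit of 1.0 assumed.
raw = np.array([0.5, 1.8, -2.2, 0.9])
penalty = antiwindup_penalty(raw, action_limit=1.0)
# Only the two saturated outputs (1.8 and -2.2) contribute to the penalty;
# it would be added to the usual policy-gradient loss before the update.
```

In this sketch the penalty is zero whenever all outputs are within limits, so it changes nothing in the unsaturated regime and only shapes behavior near saturation.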
Pages: 2513-2529
Page count: 17
Related Papers
50 records
[41]   Strategic Workforce Planning with Deep Reinforcement Learning [J].
Smit, Yannick ;
Den Hengst, Floris ;
Bhulai, Sandjai ;
Mehdad, Ehsan .
MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE, LOD 2022, PT II, 2023, 13811 :108-122
[42]   Deep Reinforcement Learning for Joint Trajectory Planning, Transmission Scheduling, and Access Control in UAV-Assisted Wireless Sensor Networks [J].
Luo, Xiaoling ;
Chen, Che ;
Zeng, Chunnian ;
Li, Chengtao ;
Xu, Jing ;
Gong, Shimin .
SENSORS, 2023, 23 (10)
[43]   Design and Experimental Validation of Deep Reinforcement Learning-Based Fast Trajectory Planning and Control for Mobile Robot in Unknown Environment [J].
Chai, Runqi ;
Niu, Hanlin ;
Carrasco, Joaquin ;
Arvin, Farshad ;
Yin, Hujun ;
Lennox, Barry .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) :5778-5792
[44]   Deep Reinforcement Learning Based Dynamic Trajectory Control for UAV-Assisted Mobile Edge Computing [J].
Wang, Liang ;
Wang, Kezhi ;
Pan, Cunhua ;
Xu, Wei ;
Aslam, Nauman ;
Nallanathan, Arumugam .
IEEE TRANSACTIONS ON MOBILE COMPUTING, 2022, 21 (10) :3536-3550
[45]   Trajectory and Communication Design for Cache-Enabled UAVs in Cellular Networks: A Deep Reinforcement Learning Approach [J].
Ji, Jiequ ;
Zhu, Kun ;
Cai, Lin .
IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (10) :6190-6204
[46]   Deep Reinforcement Learning Multi-UAV Trajectory Control for Target Tracking [J].
Moon, Jiseon ;
Papaioannou, Savvas ;
Laoudias, Christos ;
Kolios, Panayiotis ;
Kim, Sunwoo .
IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (20) :15441-15455
[47]   Deep Reinforcement Learning Enables Joint Trajectory and Communication in Internet of Robotic Things [J].
Luo, Ruyu ;
Tian, Hui ;
Ni, Wanli ;
Cheng, Julian ;
Chen, Kwang-Cheng .
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2024, 23 (12) :18154-18168
[48]   UAV Trajectory Planning Based on Deep Residual Learning Network Optimization [J].
Liu, Yinghuang ;
Lu, Zhi ;
Tian, Xizhe ;
Hou, Rui .
TRENDS IN ADVANCED UNMANNED AERIAL SYSTEMS, ICAUAS 2024, 2025, :52-58
[49]   Information-Entropy-Based Trajectory Planning for AUV-Aided Network Localization: A Reinforcement Learning Approach [J].
Huang, Peishuo ;
Li, Yichen ;
Wang, Yiyin ;
Guan, Xinping .
IEEE INTERNET OF THINGS JOURNAL, 2025, 12 (02) :2122-2134
[50]   High-Level Behavior Control of an E-Pet with Reinforcement Learning [J].
Hsu, Chih-Wei ;
Liu, Alan .
2010 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2010), 2010,