Trajectory Planning With Deep Reinforcement Learning in High-Level Action Spaces

Times cited: 7
Authors
Williams, Kyle R. [1 ]
Schlossman, Rachel [1 ]
Whitten, Daniel [1 ]
Ingram, Joe
Musuvathy, Srideep [1 ]
Pagan, James [1 ]
Williams, Kyle A. [1 ]
Green, Sam [2 ]
Patel, Anirudh [2 ]
Mazumdar, Anirban [3 ]
Parish, Julie [1 ]
Affiliations
[1] Sandia Natl Labs, Albuquerque, CA 94551 USA
[2] Semiot Labs, Los Altos, CA 94022 USA
[3] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
Trajectory; Planning; Trajectory planning; Training; Reinforcement learning; Optimization; Aerodynamics
DOI
10.1109/TAES.2022.3218496
Chinese Library Classification number
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
This article presents a technique for trajectory planning based on parameterized high-level actions. These high-level actions are subtrajectories that have variable shape and duration. The use of high-level actions can improve the performance of guidance algorithms. Specifically, we show how the use of high-level actions improves the performance of guidance policies that are generated via reinforcement learning (RL). RL has shown great promise for solving complex control, guidance, and coordination problems but can still suffer from long training times and poor performance. This work shows how the use of high-level actions reduces the required number of training steps and increases the path performance of an RL-trained guidance policy. We demonstrate the method on a space-shuttle guidance example. We show the proposed method increases the path performance (latitude range) by 18% compared with a baseline RL implementation. Similarly, we show the proposed method achieves steady state during training with approximately 75% fewer training steps. We also show how the guidance policy enables effective performance in an obstacle field. Finally, this article develops a loss function term for policy-gradient-based deep RL, which is analogous to an antiwindup mechanism in feedback control. We demonstrate that the inclusion of this term in the underlying optimization increases the average policy return in our numerical example.
Pages: 2513-2529
Number of pages: 17
Related papers
50 in total
  • [1] Combining Decision Making and Trajectory Planning for Lane Changing Using Deep Reinforcement Learning
    Li, Shurong
    Wei, Chong
    Wang, Ying
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (09) : 16110 - 16136
  • [2] Stratospheric airship trajectory planning in wind field using deep reinforcement learning
    Qi, Lele
    Yang, Xixiang
    Bai, Fangchao
    Deng, Xiaolong
    Pan, Yuelong
    ADVANCES IN SPACE RESEARCH, 2025, 75 (01) : 620 - 634
  • [3] Comfort-Oriented Motion Planning for Automated Vehicles Using Deep Reinforcement Learning
    Rajesh, Nishant
    Zheng, Yanggu
    Shyrokau, Barys
    IEEE OPEN JOURNAL OF INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 4 : 348 - 359
  • [4] Deep Reinforcement Learning for Trajectory Path Planning and Distributed Inference in Resource-Constrained UAV Swarms
    Dhuheir, Marwan
    Baccour, Emna
    Erbad, Aiman
    Al-Obaidi, Sinan Sabeeh
    Hamdi, Mounir
    IEEE INTERNET OF THINGS JOURNAL, 2023, 10 (09) : 8185 - 8201
  • [5] Deep Reinforcement Learning Based Trajectory Planning Under Uncertain Constraints
    Chen, Lienhung
    Jiang, Zhongliang
    Cheng, Long
    Knoll, Alois C.
    Zhou, Mingchuan
    FRONTIERS IN NEUROROBOTICS, 2022, 16
  • [6] 3-D Autonomous Entry Trajectory Planning via Hybrid Action Reinforcement Learning
    Peng, Gaoxiang
    Wang, Bo
    Liu, Lei
    Fan, Huijin
    IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2025, 61 (01) : 342 - 354
  • [7] Reinforcement Learning-Based Collision Avoidance and Optimal Trajectory Planning in UAV Communication Networks
    Hsu, Yu-Hsin
    Gau, Rung-Hung
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2022, 21 (01) : 306 - 320
  • [8] Evaluation of action spaces for Reinforcement Learning in optical design
    Fu, Cailing
    Onyszkiewicz, Dominik
    Kemmerling, Marco
    Stollenwerk, Jochen
    Holly, Carlo
    MACHINE LEARNING IN PHOTONICS, 2024, 13017
  • [9] Trajectory Planning for Hypersonic Vehicles with Reinforcement Learning
    Chi, Haihong
    Zhou, Mingxin
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021 : 3721 - 3726
  • [10] Multi-UAV Adaptive Cooperative Formation Trajectory Planning Based on an Improved MATD3 Algorithm of Deep Reinforcement Learning
    Xing, Xiaojun
    Zhou, Zhiwei
    Li, Yan
    Xiao, Bing
    Xun, Yilin
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (09) : 12484 - 12499