Trajectory Planning With Deep Reinforcement Learning in High-Level Action Spaces

Cited by: 7
Authors
Williams, Kyle R. [1]
Schlossman, Rachel [1]
Whitten, Daniel [1]
Ingram, Joe
Musuvathy, Srideep [1]
Pagan, James [1]
Williams, Kyle A. [1]
Green, Sam [2]
Patel, Anirudh [2]
Mazumdar, Anirban [3]
Parish, Julie [1]
Affiliations
[1] Sandia Natl Labs, Albuquerque, CA 94551 USA
[2] Semiot Labs, Los Altos, CA 94022 USA
[3] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
Trajectory; Planning; Trajectory planning; Training; Reinforcement learning; Optimization; Aerodynamics
DOI
10.1109/TAES.2022.3218496
Chinese Library Classification
V [Aeronautics and Astronautics]
Discipline classification code
08; 0825
Abstract
This article presents a technique for trajectory planning based on parameterized high-level actions. These high-level actions are subtrajectories that have variable shape and duration. The use of high-level actions can improve the performance of guidance algorithms. Specifically, we show how the use of high-level actions improves the performance of guidance policies that are generated via reinforcement learning (RL). RL has shown great promise for solving complex control, guidance, and coordination problems but can still suffer from long training times and poor performance. This work shows how the use of high-level actions reduces the required number of training steps and increases the path performance of an RL-trained guidance policy. We demonstrate the method on a space-shuttle guidance example. We show the proposed method increases the path performance (latitude range) by 18% compared with a baseline RL implementation. Similarly, we show the proposed method achieves steady state during training with approximately 75% fewer training steps. We also show how the guidance policy enables effective performance in an obstacle field. Finally, this article develops a loss function term for policy-gradient-based deep RL, which is analogous to an antiwindup mechanism in feedback control. We demonstrate that the inclusion of this term in the underlying optimization increases the average policy return in our numerical example.
Pages: 2513 - 2529
Page count: 17
Related papers
50 results in total
  • [21] Deep Reinforcement Learning for Jointly Resource Allocation and Trajectory Planning in UAV-Assisted Networks
    Jwaifel, Arwa Mahmoud
    Van Do, Tien
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2023, 2023, 14162 : 71 - 83
  • [22] Goal-Conditioned Hierarchical Reinforcement Learning With High-Level Model Approximation
    Luo, Yu
    Ji, Tianying
    Sun, Fuchun
    Liu, Huaping
    Zhang, Jianwei
    Jing, Mingxuan
    Huang, Wenbing
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (02) : 2705 - 2719
  • [23] UAV Trajectory Planning in Wireless Sensor Networks for Energy Consumption Minimization by Deep Reinforcement Learning
    Zhu, Botao
    Bedeer, Ebrahim
    Nguyen, Ha H.
    Barton, Robert
    Henry, Jerome
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2021, 70 (09) : 9540 - 9554
  • [24] Energy-Optimal Trajectory Planning for Near-Space Solar-Powered UAV Based on Hierarchical Reinforcement Learning
    Xu, Tichao
    Wu, Di
    Meng, Wenyue
    Ni, Wenjun
    Zhang, Zijian
    IEEE ACCESS, 2024, 12 : 21420 - 21436
  • [25] Trajectory Design and Resource Allocation for Multi-UAV Networks: Deep Reinforcement Learning Approaches
    Chang, Zheng
    Deng, Hengwei
    You, Li
    Min, Geyong
    Garg, Sahil
    Kaddoum, Georges
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2023, 10 (05) : 2940 - 2951
  • [26] DeepGame-TP: Integrating Dynamic Game Theory and Deep Learning for Trajectory Planning
    Lucente, Giovanni
    Maarssoe, Mikkel Skov
    Konthala, Sanath Himasekhar
    Abulehia, Anas
    Dariani, Reza
    Schindler, Julian
    IEEE OPEN JOURNAL OF INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 5 : 873 - 888
  • [28] Fast and slow curiosity for high-level exploration in reinforcement learning
    Bougie, Nicolas
    Ichise, Ryutaro
    APPLIED INTELLIGENCE, 2021, 51 (02) : 1086 - 1107
  • [29] Deep Reinforcement Learning Approach for Joint Trajectory Design in Multi-UAV IoT Networks
    Xu, Shu
    Zhan, Xiangyu
    Li, Chunguo
    Wang, Dongming
    Yang, Luxi
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2022, 71 (03) : 3389 - 3394
  • [30] Optimizing Robotic Task Sequencing and Trajectory Planning on the Basis of Deep Reinforcement Learning
    Dong, Xiaoting
    Wan, Guangxi
    Zeng, Peng
    Song, Chunhe
    Cui, Shijie
    BIOMIMETICS, 2024, 9 (01)