Autonomous Trajectory Planning Method for Stratospheric Airship Regional Station-Keeping Based on Deep Reinforcement Learning

Cited by: 6
Authors
Liu, Sitong [1, 2]
Zhou, Shuyu [1]
Miao, Jinggang [1, 2]
Shang, Hai [1]
Cui, Yuxuan [1]
Lu, Ying [1]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
trajectory planning; stratospheric airship; deep reinforcement learning; proximal policy optimization (PPO); regional station-keeping; VEHICLE;
DOI
10.3390/aerospace11090753
Chinese Library Classification (CLC) number
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
The stratospheric airship, a near-space vehicle, is increasingly used in scientific exploration and Earth observation because of its long endurance and regional observation capabilities. However, the complex stratospheric wind field makes trajectory planning for these airships a significant challenge. Unlike lower atmospheric levels, the stratosphere presents a wind field characterized by significant variability in wind speed and direction, which can drastically affect the stability of the airship's trajectory. Recent advances in deep reinforcement learning (DRL) offer promising avenues for trajectory planning: DRL algorithms have demonstrated the ability to learn complex control strategies autonomously by interacting with the environment. In particular, the proximal policy optimization (PPO) algorithm has shown effectiveness in continuous control tasks and is well suited to the non-linear, high-dimensional problem of trajectory planning in dynamic environments. This paper proposes a trajectory planning method for stratospheric airships based on the PPO algorithm. Its primary contributions are: (1) establishing a continuous action space model for stratospheric airship motion, which enables more precise control and adjustments across a broader range of actions; (2) integrating time-varying wind field data into the reinforcement learning environment, which enhances the policy network's adaptability and generalization to various environmental conditions; and (3) enabling the algorithm to automatically adjust and optimize flight paths in real time using wind speed information, reducing the need for human intervention. Experimental results show that, within its wind resistance capability, the airship can achieve long-duration regional station-keeping, with a maximum station-keeping time ratio (STR) of 0.997.
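The abstract reports a station-keeping time ratio (STR) without defining it. As a rough illustration only, the sketch below assumes STR is the fraction of uniformly sampled trajectory points that lie inside a circular station-keeping region. The function name, the circular region, the 50 km radius, and the sample trajectory are all hypothetical and are not taken from the paper.

```python
import math


def station_keeping_time_ratio(positions, center, radius_km):
    """Fraction of trajectory samples within radius_km of center.

    Assumes positions are (x, y) pairs in km sampled at a uniform time
    step, so the fraction of samples equals the fraction of flight time.
    This definition is an assumption, not the paper's stated formula.
    """
    if not positions:
        raise ValueError("trajectory is empty")
    inside = sum(
        1
        for x, y in positions
        if math.hypot(x - center[0], y - center[1]) <= radius_km
    )
    return inside / len(positions)


# Hypothetical trajectory: three samples inside a 50 km region, one outside.
trajectory = [(0.0, 0.0), (10.0, 5.0), (30.0, 30.0), (60.0, 0.0)]
str_value = station_keeping_time_ratio(trajectory, center=(0.0, 0.0), radius_km=50.0)
print(str_value)
```

Under this assumed definition, an STR of 0.997 would mean the airship stayed inside the region for 99.7% of the sampled flight time.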
Pages: 18