Autonomous Trajectory Planning Method for Stratospheric Airship Regional Station-Keeping Based on Deep Reinforcement Learning

Cited by: 6
Authors
Liu, Sitong [1 ,2 ]
Zhou, Shuyu [1 ]
Miao, Jinggang [1 ,2 ]
Shang, Hai [1 ]
Cui, Yuxuan [1 ]
Lu, Ying [1 ]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
trajectory planning; stratospheric airship; deep reinforcement learning; proximal policy optimization (PPO); regional station-keeping; VEHICLE;
DOI
10.3390/aerospace11090753
Chinese Library Classification
V [Aeronautics, Astronautics];
Discipline classification code
08 ; 0825 ;
Abstract
The stratospheric airship, as a near-space vehicle, is increasingly utilized in scientific exploration and Earth observation due to its long endurance and regional observation capabilities. However, due to the complex characteristics of the stratospheric wind field environment, trajectory planning for stratospheric airships is a significant challenge. Unlike lower atmospheric levels, the stratosphere presents a wind field characterized by significant variability in wind speed and direction, which can drastically affect the stability of the airship's trajectory. Recent advances in deep reinforcement learning (DRL) have presented promising avenues for trajectory planning. DRL algorithms have demonstrated the ability to learn complex control strategies autonomously by interacting with the environment. In particular, the proximal policy optimization (PPO) algorithm has shown effectiveness in continuous control tasks and is well suited to the non-linear, high-dimensional problem of trajectory planning in dynamic environments. This paper proposes a trajectory planning method for stratospheric airships based on the PPO algorithm. The primary contributions of this paper include establishing a continuous action space model for stratospheric airship motion; enabling more precise control and adjustments across a broader range of actions; integrating time-varying wind field data into the reinforcement learning environment; enhancing the policy network's adaptability and generalization to various environmental conditions; and enabling the algorithm to automatically adjust and optimize flight paths in real time using wind speed information, reducing the need for human intervention. Experimental results show that, within its wind resistance capability, the airship can achieve long-duration regional station-keeping, with a maximum station-keeping time ratio (STR) of up to 0.997.
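The abstract reports a station-keeping time ratio (STR) without defining it. As a minimal sketch only, the snippet below assumes STR is the fraction of uniformly sampled time steps at which the airship lies inside a circular station-keeping region. The function name, the circular region, the planar coordinates, and the sample track are illustrative assumptions, not taken from the paper.

```python
import math


def station_keeping_time_ratio(positions, center, radius):
    """Fraction of sampled positions within `radius` of `center`.

    Assumption (not from the paper): positions are sampled at a uniform
    time step, so the fraction of samples equals the fraction of time.
    """
    if not positions:
        raise ValueError("positions must be non-empty")
    inside = sum(
        1
        for x, y in positions
        if math.hypot(x - center[0], y - center[1]) <= radius
    )
    return inside / len(positions)


# Hypothetical track in planar coordinates (e.g. km east/north of the target).
track = [(0.0, 0.0), (10.0, 5.0), (30.0, 40.0), (60.0, 0.0)]
ratio = station_keeping_time_ratio(track, center=(0.0, 0.0), radius=50.0)
print(ratio)  # 3 of the 4 samples lie within 50 units of the center
```

Under this assumed definition, an STR of 0.997 would mean the airship was outside the region for about 0.3% of the sampled flight time.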
Pages: 18