Autonomous Trajectory Planning Method for Stratospheric Airship Regional Station-Keeping Based on Deep Reinforcement Learning

Cited: 6
Authors
Liu, Sitong [1 ,2 ]
Zhou, Shuyu [1 ]
Miao, Jinggang [1 ,2 ]
Shang, Hai [1 ]
Cui, Yuxuan [1 ]
Lu, Ying [1 ]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100190, Peoples R China
Funding
National Key Research and Development Program of China;
关键词
trajectory planning; stratospheric airship; deep reinforcement learning; proximal policy optimization (PPO); regional station-keeping; VEHICLE;
D O I
10.3390/aerospace11090753
CLC Classification Number
V [Aeronautics, Astronautics];
Subject Classification Code
08 ; 0825 ;
Abstract
The stratospheric airship, a near-space vehicle, is increasingly used in scientific exploration and Earth observation because of its long endurance and regional observation capability. However, the complexity of the stratospheric wind field makes trajectory planning for stratospheric airships a significant challenge. Unlike the lower atmosphere, the stratosphere exhibits large variability in wind speed and direction, which can drastically destabilize the airship's trajectory. Recent advances in deep reinforcement learning (DRL) offer promising avenues for trajectory planning: DRL algorithms can learn complex control strategies autonomously by interacting with the environment. In particular, the proximal policy optimization (PPO) algorithm has proven effective in continuous control tasks and is well suited to the non-linear, high-dimensional problem of trajectory planning in dynamic environments. This paper proposes a PPO-based trajectory planning method for stratospheric airships. Its primary contributions are: establishing a continuous action-space model of stratospheric airship motion, enabling more precise control and adjustment over a broader range of actions; integrating time-varying wind field data into the reinforcement learning environment, improving the policy network's adaptability and generalization across environmental conditions; and enabling the algorithm to adjust and optimize flight paths in real time from wind speed information, reducing the need for human intervention. Experimental results show that, within its wind-resistance capability, the airship achieves long-duration regional station-keeping, with a maximum station-keeping time ratio (STR) of up to 0.997.
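The station-keeping time ratio (STR) reported above is the fraction of flight time the airship remains within its assigned station-keeping region. A minimal sketch of that metric, assuming a circular horizontal region (the record does not specify the exact region shape, and `station_keeping_time_ratio` is a hypothetical helper name):

```python
import math

def station_keeping_time_ratio(positions, center, radius):
    """Fraction of sampled timesteps at which the airship lies inside
    the station-keeping region, modeled here as a horizontal circle of
    the given radius around `center`. Positions are (x, y) pairs in km."""
    inside = sum(
        1 for (x, y) in positions
        if math.hypot(x - center[0], y - center[1]) <= radius
    )
    return inside / len(positions)

# Hypothetical trajectory samples: three of four fall inside a 50 km region.
track = [(0.0, 0.0), (10.0, 5.0), (60.0, 0.0), (20.0, 20.0)]
print(station_keeping_time_ratio(track, (0.0, 0.0), 50.0))  # 0.75
```

An STR near 1.0 (such as the paper's reported 0.997) means the planned trajectory kept the airship inside the region for almost the entire mission.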
Pages: 18
Cited References
37 records
[1]   Survey of Deep Reinforcement Learning for Motion Planning of Autonomous Vehicles [J].
Aradi, Szilard .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (02) :740-759
[2]   Deep Reinforcement Learning: A Brief Survey [J].
Arulkumaran, Kai ;
Deisenroth, Marc Peter ;
Brundage, Miles ;
Bharath, Anil Anthony .
IEEE SIGNAL PROCESSING MAGAZINE, 2017, 34 (06) :26-38
[3]
Copernicus Climate Data Store. Available online: cds.climate.copernicus.eu
[4]   Deep Reinforcement Learning Based Trajectory Planning Under Uncertain Constraints [J].
Chen, Lienhung ;
Jiang, Zhongliang ;
Cheng, Long ;
Knoll, Alois C. ;
Zhou, Mingchuan .
FRONTIERS IN NEUROROBOTICS, 2022, 16
[5]   Method for collision avoidance based on deep reinforcement learning with path-speed control for an autonomous ship [J].
Chun, Do-Hyun ;
Roh, Myung-Il ;
Lee, Hye-Won ;
Yu, Donghun .
INTERNATIONAL JOURNAL OF NAVAL ARCHITECTURE AND OCEAN ENGINEERING, 2024, 16
[6]   High-Altitude Platforms - Present Situation and Technology Trends [J].
d'Oliveira, Flavio Araripe ;
Lourenco de Melo, Francisco Cristovao ;
Devezas, Tessaleno Campos .
JOURNAL OF AEROSPACE TECHNOLOGY AND MANAGEMENT, 2016, 8 (03) :249-262
[7]  
Elshamli A., 2004, Canadian Conference on Electrical and Computer Engineering 2004 (IEEE Cat. No.04CH37513), P677, DOI 10.1109/CCECE.2004.1345203
[8]   An improved A-Star based path planning algorithm for autonomous land vehicles [J].
Erke, Shang ;
Bin, Dai ;
Yiming, Nie ;
Qi, Zhu ;
Liang, Xiao ;
Dawei, Zhao .
INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2020, 17 (05)
[9]   UAV navigation in high dynamic environments: A deep reinforcement learning approach [J].
Guo, Tong ;
Jiang, Nan ;
Li, Biyue ;
Zhu, Xi ;
Wang, Ya ;
Du, Wenbo .
CHINESE JOURNAL OF AERONAUTICS, 2021, 34 (02) :479-489
[10]  
Huijuan Wang, 2011, 2011 Second International Conference on Mechanic Automation and Control Engineering, P1067