Energy-Optimal Trajectory Planning for Near-Space Solar-Powered UAV Based on Hierarchical Reinforcement Learning

Cited by: 2
Authors
Xu, Tichao [1, 2, 3]
Wu, Di [1]
Meng, Wenyue [1, 3]
Ni, Wenjun [1, 2]
Zhang, Zijian [1, 3]
Affiliations
[1] Chinese Acad Sci, Inst Engn Thermophys, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Aeronaut & Astronaut, Beijing 100049, Peoples R China
[3] Natl Key Lab Sci & Technol Adv Light Duty Gas Turb, Beijing 100190, Peoples R China
Keywords
Autonomous aerial vehicles; Trajectory planning; Training; Reinforcement learning; Mathematical models; Aircraft; Task analysis; Aerospace engineering; Aerospace control; Energy management; Flight strategy; Guidance and control; Near-space solar-powered aircraft; Hierarchical reinforcement learning; Optimization; Strategy
DOI
10.1109/ACCESS.2024.3359901
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Trajectory planning is a key technology that enables a near-space solar-powered unmanned aerial vehicle (UAV) to fly day and night, track the solar peak, and reduce flight energy consumption. However, the varying environmental conditions that such a UAV encounters during long-term flight make online trajectory planning challenging. This article introduces a hierarchical guidance method built on a hierarchical reinforcement learning algorithm, with a two-layer neural network structure consisting of bottom-level trajectory planning models and a top-level decision model. The top-level decision maker selects the appropriate bottom-level planner based on flight state and current environmental information, and the selected planner outputs thrust, angle-of-attack, and bank-angle commands based on its input information. This hierarchical guidance structure can improve the UAV's adaptability to variations in the energy environment and enable autonomous, energy-maximizing flight in long-term missions. Flight simulations spanning spring, summer, and autumn show that the guidance controller switches flight policies on its own as the environment changes, allowing the UAV to maximize its energy gain each day and thereby achieve the best energy management strategy over long-term flight. The simulation results also demonstrate the over-fitting and under-fitting effects of the neural network in the solar UAV trajectory planning task, which supports the need for hierarchical guidance.
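The two-layer structure described in the abstract can be sketched as a selector that dispatches to one of several planners. This is only an illustrative sketch, not the authors' implementation: in the paper both layers are neural networks trained with hierarchical reinforcement learning, whereas here the top level is a hand-written threshold rule and the planners return fixed placeholder commands. All names (`Observation`, `select_planner`, `guidance_step`, the planner names) and all numeric values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Observation:
    """Hypothetical flight and environment state passed to both layers."""
    altitude_m: float
    airspeed_mps: float
    battery_soc: float          # state of charge, 0.0 to 1.0
    solar_elevation_deg: float  # sun elevation above the horizon


# Command tuple: (thrust in N, angle of attack in deg, bank angle in deg).
Command = Tuple[float, float, float]
Planner = Callable[[Observation], Command]


def high_sun_planner(obs: Observation) -> Command:
    # Placeholder for a trained bottom-level policy network (assumed values).
    return (60.0, 4.0, 0.0)


def low_sun_planner(obs: Observation) -> Command:
    # Placeholder for a second trained bottom-level policy (assumed values).
    return (20.0, 6.0, 5.0)


PLANNERS: Dict[str, Planner] = {
    "high_sun": high_sun_planner,
    "low_sun": low_sun_planner,
}


def select_planner(obs: Observation) -> str:
    # Stand-in for the learned top-level decision model: a fixed threshold
    # on solar elevation, chosen arbitrarily for this sketch.
    return "high_sun" if obs.solar_elevation_deg > 30.0 else "low_sun"


def guidance_step(obs: Observation) -> Tuple[str, Command]:
    # Top level picks a planner; the chosen planner produces the commands.
    name = select_planner(obs)
    return name, PLANNERS[name](obs)


if __name__ == "__main__":
    noon = Observation(altitude_m=20000.0, airspeed_mps=30.0,
                       battery_soc=0.8, solar_elevation_deg=55.0)
    print(guidance_step(noon))
```

Keeping the selector separate from the planners means each planner can be trained for one energy environment, while only the selector has to recognize which environment currently applies.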
Pages: 21420-21436
Number of pages: 17
Related papers
32 in total
[1]  
Asselin Mario., 1997, An Introduction to Aircraft Performance
[2]  
Bolandhemmat H, 2019, 2019 18TH EUROPEAN CONTROL CONFERENCE (ECC), P1486, DOI 10.23919/ECC.2019.8796240
[3]  
Chang M., 2013, Research on design methodology and flight dynamics of solar-powered stratospheric aircraft from low to middle latitudes
[4]  
Cheng X.-F., 2020, Informatization Res., V46, P13
[5]   Hierarchical reinforcement learning with the MAXQ value function decomposition [J].
Dietterich, TG .
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2000, 13 :227-303
[6]   Optimal Energy Utilization for a Solar-Powered Aircraft Using Sliding-Mode-Based Attitude Control [J].
Dwivedi, Vijay Shankar ;
Salahudden ;
Giri, Dipak K. ;
Ghosh, Ajoy Kanti ;
Kamath, G. M. .
IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2021, 57 (01) :105-118
[7]  
Etkin B., 2012, DYNAMICS ATMOSPHERIC
[8]  
Fu Yuewen, 2018, Journal of System Simulation, V30, P4151
[9]  
Gao X.-Z., 2023, J. Aeronaut., V44, P6
[10]   Joint optimization of battery mass and flight trajectory for high-altitude solar-powered aircraft [J].
Gao, Xian-Zhong ;
Hou, Zhong-Xi ;
Guo, Zheng ;
Chen, Xiao-Qing ;
Chen, Xiao-Qian .
PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART G-JOURNAL OF AEROSPACE ENGINEERING, 2014, 228 (13) :2439-2451