Research on Self-Learning Control Method of Reusable Launch Vehicle Based on Neural Network Architecture Search

Cited by: 1
Authors
Xue, Shuai [1 ]
Wang, Zhaolei [2 ]
Bai, Hongyang [1 ]
Yu, Chunmei [2 ]
Li, Zian [1 ]
Affiliations
[1] Nanjing University of Science and Technology, School of Energy and Power Engineering, Nanjing 210094, People's Republic of China
[2] Beijing Aerospace Automatic Control Institute, National Key Laboratory of Science and Technology on Aerospace Intelligence Control, Beijing 100854, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
reusable launch vehicle; deep reinforcement learning; neural network architecture search; Bayesian optimization; self-learning control;
DOI
10.3390/aerospace11090774
Chinese Library Classification
V [Aeronautics, Astronautics]
Discipline Classification Codes
08; 0825
Abstract
Reusable launch vehicles must operate in complex and varied flight environments. When the rocket recovery control law is designed with conventional deep reinforcement learning (DRL), it is difficult to obtain a network architecture that adapts to multiple scenarios and multi-parameter uncertainties, and the performance of the DRL algorithm depends on manual trial-and-error tuning of its hyperparameters. To address this problem, this paper proposes a self-learning control method for launch vehicle recovery based on neural architecture search (NAS) that decouples the search for the deep network structure from the optimization of the reinforcement learning hyperparameters. First, a NAS technique based on a multi-objective hybrid particle swarm optimization algorithm automatically designs the deep network architecture of the proximal policy optimization (PPO) algorithm, with a lightweight search space used throughout. Second, to further improve landing accuracy, Bayesian optimization (BO) automatically tunes the reinforcement learning hyperparameters, and the control law for the landing phase of the recovery process is obtained through training. Finally, the algorithm is ported to a rocket intelligent-learning embedded platform for comparative testing to verify its online deployment capability. Simulation results show that the proposed method meets the landing-accuracy requirements of the recovery mission, and that under untrained conditions with model parameter deviations and wind-field disturbances the control performance is essentially the same as that of the trained model, which verifies the generalization ability of the proposed method.
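The abstract describes a two-stage, decoupled pipeline: a particle-swarm-based architecture search over the PPO policy network, followed by Bayesian optimization of the remaining RL hyperparameters. Below is a minimal illustrative sketch of that decoupling, not the authors' implementation: the fitness function is a synthetic placeholder standing in for training and evaluating a PPO landing policy, the names (evaluate_architecture, pso_search) are hypothetical, and a plain single-objective PSO is used where the paper uses a multi-objective hybrid variant.

```python
# Illustrative sketch only (not the paper's code): stage 1, PSO over
# hidden-layer widths of a PPO policy network.
import random

def evaluate_architecture(widths, lr=3e-4, clip=0.2):
    """Placeholder fitness: in the paper this would be the cost of training
    a PPO landing policy with the given hidden-layer widths (e.g., negative
    landing accuracy). Synthetic here so the sketch runs stand-alone."""
    return (sum((x - 96) ** 2 for x in widths) / 1e4
            + abs(lr - 3e-4) * 1e3 + abs(clip - 0.2))

def pso_search(n_particles=8, n_layers=2, iters=20, w=0.7, c1=1.5, c2=1.5):
    """Minimal single-objective PSO over continuous layer widths in
    [16, 256], rounded to integers at evaluation time."""
    pos = [[random.uniform(16, 256) for _ in range(n_layers)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_layers for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [evaluate_architecture([round(x) for x in p]) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n_layers):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(256.0, max(16.0, pos[i][d] + vel[i][d]))
            cost = evaluate_architecture([round(x) for x in pos[i]])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
    return [round(x) for x in gbest], gbest_cost

arch, _ = pso_search()
print("selected hidden-layer widths:", arch)
```

Continuing the same sketch, the second stage tunes hyperparameters for the fixed architecture found above with a Gaussian-process optimizer; the use of scikit-optimize's gp_minimize is an assumption, since the record does not name the BO implementation.

```python
# Stage 2 sketch (continuation of the block above): Bayesian optimization of
# PPO hyperparameters for the fixed architecture `arch`. scikit-optimize is
# assumed; the authors' BO library is not named in the record.
from skopt import gp_minimize
from skopt.space import Real

space = [Real(1e-5, 1e-2, prior="log-uniform", name="lr"),  # learning rate
         Real(0.05, 0.4, name="clip")]                      # PPO clip range

def objective(params):
    lr, clip = params
    # Same placeholder cost; in practice this retrains/evaluates the policy.
    return evaluate_architecture(arch, lr=lr, clip=clip)

result = gp_minimize(objective, space, n_calls=20, random_state=0)
print("tuned (lr, clip):", result.x)
```

In the paper the expensive inner evaluation (training a landing policy) is shared by both stages, which is why decoupling the architecture search from hyperparameter tuning keeps the overall search tractable.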
Pages: 21