Deep Reinforcement Learning design of safe, stable and robust control for sloshing-affected space launch vehicles

Cited by: 0
Authors
Cocaul, Pericles [1]
Bertrand, Sylvain [1]
Piet-Lahanier, Helene [1]
Lemazurier, Lori [2]
Ganet, Martine [3]
Affiliations
[1] Univ Paris Saclay, Dept Traitement Informat & Syst, ONERA, Av Vauve Granges, F-91123 Palaiseau, France
[2] ArianeGroup SAS, 66 Route Verneuil, F-78130 Les Mureaux, France
[3] Formerly ArianeGroup SAS, F-92400 Courbevoie, France
Keywords
Launch vehicles control; Reinforcement learning; Control barrier
DOI
10.1016/j.conengprac.2025.106328
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
New challenges in space missions and the design of new launchers call for innovative control strategies. Recent developments in Machine Learning (ML) for optimization highlight its potential for controlling complex, nonlinear, partially unknown systems. This work uses these methods to design control laws that stabilize propellant sloshing in tanks during launcher flight. A major hurdle in applying control laws designed by Artificial Intelligence (AI) to safety-critical systems lies in certifying stability and safety. Control Lyapunov Functions (CLF) and Control Barrier Functions (CBF), developed in Control Theory, make it possible to verify closed-loop stability and safety in terms of state constraints. Within a Deep Reinforcement Learning (DRL) framework, an algorithm is developed to learn a control policy along with stability and safety certificates. The CLF and CBF conditions are integrated into the DRL algorithm, bridging the gap between Control Theory and Machine Learning techniques. A safe and stable DRL controller is then learned on a simulated launcher subject to sloshing, with uncertainties and sloshing-induced perturbations. A robustness study based on Monte Carlo simulations evaluates performance under various conditions. Finally, the controller is validated on an industrial simulator that models the real behavior of the launcher more accurately. Despite not being trained on this industrial simulator, the controller meets the control objectives, demonstrating robustness and performance.
Pages: 12