Distributed cooperative control with collision avoidance for spacecraft swarm reconfiguration via reinforcement learning

Cited by: 7
Authors
Sun, Jun [1, 2]
Meng, Yizhen [1, 2]
Huang, Jing [1, 2]
Liu, Fucheng [1, 2]
Li, Shuang [3]
Affiliations
[1] Shanghai Aerosp Control Technol Inst, Shanghai 201109, Peoples R China
[2] Shanghai Key Lab Aerosp Intelligent Control Techno, Shanghai 201109, Peoples R China
[3] Nanjing Univ Aeronaut & Astronaut, Coll Astronaut, Nanjing 211106, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spacecraft swarm reconfiguration; Distributed cooperative control; Reinforcement learning; Back-stepping control; Soft and hard constraints; TRACKING CONTROL; SYSTEMS;
DOI
10.1016/j.actaastro.2023.01.017
Chinese Library Classification
V [Aeronautics, Astronautics];
Discipline classification code
08 ; 0825 ;
Abstract
This article investigates an adaptive, distributed, cooperative control strategy for spacecraft swarm reconfiguration, in which the spacecraft must assemble at close distances to one another while avoiding mutual collisions and keeping clear of an obstacle. The key idea is to transform these conflicting performance indices into equivalent ones by using soft and hard constraints. The proposed control strategy is inspired by the actor-critic framework of reinforcement learning (RL) algorithms: the soft constraint is designed by using a critic neural network (NN) for assembling and avoiding obstacles, while collisions among the spacecraft are prevented by a hard constraint established in an artificial potential field (APF). Building on this equivalent transformation, the adaptive, distributed, cooperative controller is devised by using an actor NN of the RL algorithm, an APF, and backstepping control. The actor NNs estimate the input signals of the desired control and the undesired effects due to disturbance from the APF, and the expected control performance is then obtained by minimizing the output of the critic NN. The computational burden incurred by the NNs is significantly reduced by reducing the number of parameters that they need to learn. Lyapunov stability theory is used to guarantee that all signals in the closed-loop system are uniformly ultimately bounded, which ensures its stability. Simulations of a spacecraft swarm demonstrate the effectiveness of the proposed control strategy.
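As a rough illustration of the hard constraint mentioned in the abstract, the sketch below computes a repulsive force from a classic Khatib-style artificial potential field. It is not the formulation used in the article, which this record does not give. The function names, the safety radius `d_safe`, and the gain `eta` are all hypothetical choices made for the example.

```python
import math


def repulsive_force(p_i, p_j, d_safe, eta=1.0):
    """Repulsive force on spacecraft i due to neighbor j (illustrative only).

    Assumed potential: U = 0.5 * eta * (1/d - 1/d_safe)^2 for d < d_safe,
    and U = 0 otherwise, where d is the distance between the two positions.
    """
    diff = [a - b for a, b in zip(p_i, p_j)]
    d = math.sqrt(sum(c * c for c in diff))
    # Outside the safety radius the field is inactive. Coincident positions
    # have no defined direction, so this sketch returns zero there as well.
    if d >= d_safe or d == 0.0:
        return [0.0] * len(diff)
    # Force is the negative gradient of U with respect to p_i; it points
    # from j toward i and grows without bound as d approaches zero.
    magnitude = eta * (1.0 / d - 1.0 / d_safe) / (d * d)
    return [magnitude * c / d for c in diff]


def total_repulsion(positions, i, d_safe, eta=1.0):
    """Sum the repulsive forces on spacecraft i from all other swarm members."""
    total = [0.0] * len(positions[i])
    for j, p_j in enumerate(positions):
        if j == i:
            continue
        force = repulsive_force(positions[i], p_j, d_safe, eta)
        total = [t + c for t, c in zip(total, force)]
    return total
```

In a controller of the kind the abstract describes, a term like this would be added to the control input of each spacecraft so that it only becomes active when a neighbor enters the safety radius.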
Pages: 95-109
Page count: 15
Related papers
31 in total
[1]  
Alfriend KT, 2010, SPACECRAFT FORMATION FLYING: DYNAMICS, CONTROL, AND NAVIGATION, P1, DOI 10.1016/B978-0-7506-8533-7.00206-2
[2]   NN Reinforcement Learning Adaptive Control for a Class of Nonstrict-Feedback Discrete-Time Systems [J].
Bai, Weiwei ;
Li, Tieshan ;
Tong, Shaocheng .
IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (11) :4573-4584
[3]  
Chen H., 2021, COMPUTEWR SCI COMPUT, V6, P1, DOI 10.48550/arXiv.2109.12932
[4]   Satellite Formation-Containment Flying Control with Collision Avoidance [J].
Chen, Liangming ;
Guo, Yanning ;
Li, Chuanjiang ;
Huang, Jing .
JOURNAL OF AEROSPACE INFORMATION SYSTEMS, 2018, 15 (05) :253-270
[5]   Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors [J].
Duan, Jingliang ;
Guan, Yang ;
Li, Shengbo Eben ;
Ren, Yangang ;
Sun, Qi ;
Cheng, Bo .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (11) :6584-6598
[6]   Robust Connectivity Preserving Rendezvous of Multirobot Systems Under Unknown Dynamics and Disturbances [J].
Feng, Zhi ;
Sun, Chao ;
Hu, Guoqiang .
IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2017, 4 (04) :725-735
[7]  
Gao F, 2018, IEEE INT CONF ROBOT, P344
[8]   Adaptive guidance and integrated navigation with reinforcement meta-learning [J].
Gaudet, Brian ;
Linares, Richard ;
Furfaro, Roberto .
ACTA ASTRONAUTICA, 2020, 169 :180-190
[9]   Integral Reinforcement Learning-Based Adaptive NN Control for Continuous-Time Nonlinear MIMO Systems With Unknown Control Directions [J].
Guo, Xinxin ;
Yan, Weisheng ;
Cui, Rongxin .
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, 50 (11) :4068-4077
[10]   Sufficient conditions for connectivity maintenance and rendezvous in leader-follower networks [J].
Gustavi, Tove ;
Dimarogonas, Dimos V. ;
Egerstedt, Magnus ;
Hu, Xiaoming .
AUTOMATICA, 2010, 46 (01) :133-139