Neural network-based reinforcement learning control for combined spacecraft attitude tracking maneuvers

Cited by: 16

Authors:

Liu, Yuhan [1]

Ma, Guangfu [1]

Lyu, Yueyong [1]

Wang, Pengyu [1]

Affiliation:

[1] Harbin Inst Technol, Sch Astronaut, Harbin 150001, Peoples R China

Source:

NEUROCOMPUTING | 2022 / Vol. 484

Keywords:

Combined spacecraft; Attitude tracking; Reinforcement learning; Q-learning; TARGET; IDENTIFICATION; POSTCAPTURE; ROBOT; SYSTEMS;

DOI:

10.1016/j.neucom.2021.07.099

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper proposes a novel reinforcement learning-based attitude tracking control strategy for combined spacecraft takeover maneuvers with completely unknown dynamics. One major issue in combined spacecraft attitude takeover control is that an accurate dynamic model is highly nonlinear, complex, and costly to identify online, which makes it impractical for control design. To address this issue, we take advantage of the Q-learning algorithm to acquire the control strategy directly from system input/output measurement data in a model-free manner, thereby avoiding the online inertia parameter identification procedure. More specifically, attitude tracking is first formulated as a regulation problem by introducing an augmented system, where the system dynamic model is still required in control design. Then, to achieve a model-free control strategy, an online policy iteration (PI) Q-learning procedure is derived to solve the Bellman optimality equation using the generated measurement data. The theoretical analysis proves that the iteration sequences of the Q-value function and the control strategy converge to the optimal ones. In addition, rigorous proofs of the stability and monotonicity guarantees of the proposed control strategy are provided. Furthermore, for online implementation, an off-policy learning scheme is employed to find the optimal Q-value function approximator with a neural network structure after the data-collection phase. Numerical simulations validate the effectiveness of the proposed strategy. (c) 2021 Elsevier B.V. All rights reserved.
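
As a rough illustration of the policy-iteration Q-learning idea summarized in the abstract, the sketch below learns a feedback gain purely from recorded input/state data. It is not the paper's method: it assumes a toy discrete-time linear system and a quadratic Q-function fitted by least squares in place of the neural network approximator and the nonlinear spacecraft dynamics. All names (`quad_features`, `collect_data`, `pi_q_learning`) and the matrices `A`, `B`, `Qc`, `Rc` are hypothetical choices made for this example.

```python
import numpy as np

def quad_features(z):
    """Features phi(z) with z^T H z == phi(z) @ theta for a symmetric H."""
    n = len(z)
    return np.array([(1.0 if i == j else 2.0) * z[i] * z[j]
                     for i in range(n) for j in range(i, n)])

def theta_to_matrix(theta, n):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    H[np.triu_indices(n)] = theta
    return H + np.triu(H, 1).T

def collect_data(A, B, steps, rng):
    """Data-collection phase: drive the plant with random exploratory inputs.
    A and B are used only to simulate the plant, never by the learner."""
    n, m = B.shape
    x = rng.standard_normal(n)
    data = []
    for _ in range(steps):
        u = rng.standard_normal(m)
        x_next = A @ x + B @ u
        data.append((x, u, x_next))
        x = x_next
    return data

def pi_q_learning(data, Qc, Rc, K0, iters=10):
    """Off-policy policy iteration on Q(x, u) = [x; u]^T H [x; u]."""
    n, m = Qc.shape[0], Rc.shape[0]
    K = K0.copy()
    for _ in range(iters):
        Phi, cost = [], []
        for x, u, x_next in data:
            z = np.concatenate([x, u])
            # The next action follows the current target policy u = -K x.
            z_next = np.concatenate([x_next, -K @ x_next])
            # Bellman equation: Q(z) - Q(z_next) = stage cost.
            Phi.append(quad_features(z) - quad_features(z_next))
            cost.append(x @ Qc @ x + u @ Rc @ u)
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(cost), rcond=None)
        H = theta_to_matrix(theta, n + m)
        # Policy improvement: minimize Q(x, u) over u.
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H

# Toy example (assumed system; stable, so K0 = 0 is an admissible start).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc, Rc = np.eye(2), np.eye(1)
rng = np.random.default_rng(0)
data = collect_data(A, B, steps=200, rng=rng)
K_learned, H_learned = pi_q_learning(data, Qc, Rc, K0=np.zeros((1, 2)))

# Model-based reference gain from Riccati value iteration, for comparison only.
P = Qc.copy()
for _ in range(2000):
    G = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
    P = Qc + A.T @ P @ A - A.T @ P @ B @ G
K_star = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
```

In this sketch the same batch of exploratory data is reused at every iteration, which mirrors the off-policy, post-data-collection flavor described in the abstract. The learner never reads `A` or `B`; they appear only in the simulator and in the reference computation of `K_star`.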

Pages: 67-78

Number of pages: 12

50 items in total

[31] Adaptive Attitude Control of Combined Spacecraft With Large Parametric Uncertainties and Adversarial Disturbance
Guo, Xincheng
Meng, Zhongjie
Jia, Cheng
IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2025, 61 (01) : 632 - 641
[32] Fuzzy neural control of satellite attitude by TD based reinforcement learning
Cui, Xiao-ting
Liu, Xiang-dong
WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 3983 - +
[33] Time-Varying Sliding Mode Control for Spacecraft Attitude Tracking Maneuvers with a Quadratic Cost
Cong Binglong
Liu Xiangdong
Chen Zhen
Xia Yongjiang
2011 30TH CHINESE CONTROL CONFERENCE (CCC), 2011, : 2527 - 2532
[34] Dynamic infinity-norm constrained control allocation for attitude tracking control of overactuated combined spacecraft
Huang, Xiuwei
Duan, Guang-Ren
IET CONTROL THEORY AND APPLICATIONS, 2019, 13 (11) : 1692 - 1703
[35] Active attitude fault-tolerant tracking control of flexible spacecraft via the Chebyshev neural network
Lu, Kunfeng
Li, Tianya
Zhang, Lijun
TRANSACTIONS OF THE INSTITUTE OF MEASUREMENT AND CONTROL, 2019, 41 (04) : 925 - 933
[36] Improving Convolutional Neural Network-Based Webshell Detection Through Reinforcement Learning
Wu, Yalun
Song, Minglu
Li, Yike
Tian, Yunzhe
Tong, Endong
Niu, Wenjia
Jia, Bowei
Huang, Haixiang
Li, Qiong
Liu, Jiqiang
INFORMATION AND COMMUNICATIONS SECURITY (ICICS 2021), PT I, 2021, 12918 : 368 - 383
[37] Attitude Tracking Control With Constraints for Rigid Spacecraft Based on Control Barrier Lyapunov Functions
Wu, Yu-Yao
Sun, Hui-Jie
IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2022, 58 (03) : 2053 - 2062
[38] A laguerre neural network-based ADP learning scheme with its application to tracking control in the Internet of Things
Luo, Xiong
Lv, Yixuan
Zhou, Mi
Wang, Weiping
Zhao, Wenbing
PERSONAL AND UBIQUITOUS COMPUTING, 2016, 20 (03) : 361 - 372
[39] Chattering-free adaptive iterative learning for attitude tracking control of uncertain spacecraft
Zhang, Fan
Meng, Deyuan
Li, Xuefang
AUTOMATICA, 2023, 151
[40] Neural Network-Based Adaptive Control for Spacecraft Under Actuator Failures and Input Saturations
Zhou, Ning
Kawano, Yu
Cao, Ming
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (09) : 3696 - 3710
