Optimal robust online tracking control for space manipulator in task space using off-policy reinforcement learning

Cited by: 1
Authors
Zhuang, Hongji [1]
Zhou, Hang [1]
Shen, Qiang [1]
Wu, Shufan [1, 2]
Razoumny, Vladimir Yu. [2]
Razoumny, Yury N. [2]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai 200240, Peoples R China
[2] Peoples Friendship Univ Russia, RUDN Univ, Moscow 117198, Russia
Funding
National Natural Science Foundation of China
Keywords
Task space; Reinforcement learning; Online tracking control; H-infinity control; Robot manipulators; Nonlinear systems; Time systems; Approximation; State
DOI
10.1016/j.ast.2024.109446
Chinese Library Classification number
V [Aviation, Aerospace]
Discipline classification codes
08; 0825
Abstract
This study addresses the demands for adaptability, uncertainty management, and high performance in space manipulator control, as well as the shortcomings of previous research in achieving optimal control and handling external uncertainty in task space. Based on off-policy reinforcement learning, a model-free and time-efficient method for online robust tracking control in task space is devised. To address the complexity of the dynamic equations in task space, a mixed-variable approach is adopted to transform the multivariable, coupled, time-varying problem into a single-variable problem. The optimal control policy is then derived, and the disturbance convergence, stability, and optimality of the control method are demonstrated. This marks the first instance of robust optimal tracking control in task space for space manipulators. The efficacy and superiority of the presented algorithm are validated through simulation.
Pages: 12