Actor-Critic-Identifier Structure-Based Decentralized Neuro-Optimal Control of Modular Robot Manipulators With Environmental Collisions

Cited by: 13
Authors
Dong, Bo [1, 2]
An, Tianjiao [1]
Zhou, Fan [1]
Liu, Keping [1]
Yu, Weibo [1]
Li, Yuanchun [1]
Affiliations
[1] Changchun Univ Technol, Dept Control Sci & Engn, Changchun 130012, Jilin, Peoples R China
[2] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adaptive dynamic programming; collision identification; decentralized optimal control; modular robot manipulators; zero-sum game; NONLINEAR-SYSTEMS; ROBUST-CONTROL; POLICY ITERATION; TORQUE; JOINT; STABILIZATION; POSITION; DESIGN;
DOI
10.1109/ACCESS.2019.2927511
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This paper presents a decentralized zero-sum optimal control method for modular robot manipulators (MRMs) with environmental collisions via an actor-critic-identifier (ACI) structure-based adaptive dynamic programming (ADP) algorithm. The dynamic model of the MRMs is formulated via a novel collision identification method that is deployed for each joint module, in which local position and torque information is used to design the model compensation controller. A neural network (NN) identifier is developed to compensate for the model uncertainties, and the optimal control problem of the MRMs with environmental collisions is then transformed into a two-player zero-sum optimal control problem. Based on the ADP algorithm, the Hamilton-Jacobi-Isaacs (HJI) equation is solved by constructing the actor-critic NNs, which makes the derivation of the approximate optimal control policy feasible. Based on Lyapunov theory, the closed-loop robotic system is proved to be asymptotically stable. Finally, experiments are conducted to verify the effectiveness and advantages of the proposed method.
Pages: 96148-96165
Page count: 18