Zero-sum game-based neuro-optimal control of modular robot manipulators with uncertain disturbance using critic only policy iteration

Cited by: 44

Authors:

Dong, Bo [1]

An, Tianjiao [1]

Zhu, Xinye [1]

Li, Yuanchun [1]

Liu, Keping [1]

Affiliation:

[1] Changchun Univ Technol, Dept Control Sci & Engn, Changchun 130012, Jilin, Peoples R China

Source:

NEUROCOMPUTING | 2021 / Vol. 450

Funding:

National Natural Science Foundation of China;

Keywords:

Modular robot manipulators; Adaptive dynamic programming; Critic only policy iteration; Optimal control; Neural network; Zero-sum differential game; TIME NONLINEAR-SYSTEMS; TRACKING CONTROL; ROBUST-CONTROL; CONTROL SCHEME; DESIGN;

DOI:

10.1016/j.neucom.2021.04.032

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, a neuro-optimal control method based on a zero-sum differential game strategy is presented via a critic only policy iteration-adaptive dynamic programming (COPI-ADP) approach to address the optimal trajectory tracking control problem of modular robot manipulators (MRMs) with uncertain disturbance. The dynamic model of the modular robot manipulator system is formulated as an integration of joint subsystems, and the unknown robotic model uncertainties are identified by the developed linear extension state observer. Then, the optimal control problem of the modular robot manipulator system with uncertain disturbance is transformed into a two-player zero-sum differential game. Based on adaptive dynamic programming and policy iteration algorithms, the Hamilton-Jacobi-Isaacs (HJI) equation is approximately solved using only a critic neural network, which makes the derivation of the approximate optimal control policy feasible. The trajectory tracking errors of the modular robot manipulator system are guaranteed to be uniformly ultimately bounded by using Lyapunov theory. Finally, experiments are provided to demonstrate the advantage and effectiveness of the developed control method. (c) 2021 Elsevier B.V. All rights reserved.
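The policy-iteration idea summarized in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a scalar linear plant x' = a*x + b*u + k*w with running cost q*x^2 + r*u^2 - gamma^2*w^2, where u is the control (minimizing player) and w is the disturbance (maximizing player). Under these assumptions the value function is V(x) = p*x^2, so the single scalar p plays the role of a one-weight "critic", and no neural network or state observer is involved. All names and numeric values (`zero_sum_policy_iteration`, `hji_residual`, a, b, k, q, r, gamma) are hypothetical and chosen only for illustration.

```python
import math


def hji_residual(p, a, b, k, q, r, gamma):
    """Residual of the scalar HJI (game Riccati) equation for V(x) = p * x**2.

    Assumed toy model: x' = a*x + b*u + k*w,
    running cost q*x**2 + r*u**2 - gamma**2 * w**2.
    """
    s = b**2 / r - k**2 / gamma**2
    return 2.0 * a * p + q - s * p**2


def zero_sum_policy_iteration(a, b, k, q, r, gamma, p0=0.0,
                              max_iters=50, tol=1e-12):
    """Simultaneous policy iteration on the single critic weight p.

    Each pass evaluates the current policy pair
        u = -(b / r) * p * x          (control, minimizer)
        w = (k / gamma**2) * p * x    (disturbance, maximizer)
    by solving a scalar Lyapunov equation, then improves both
    policies using the newly evaluated weight.
    """
    s = b**2 / r - k**2 / gamma**2
    p = p0
    for _ in range(max_iters):
        a_cl = a - s * p  # closed-loop pole under the current policy pair
        if a_cl >= 0.0:
            raise ValueError("current policy pair is not stabilizing")
        # Policy evaluation: 2 * a_cl * p_next + (q + s * p**2) = 0
        p_next = (q + s * p**2) / (-2.0 * a_cl)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


if __name__ == "__main__":
    # Hypothetical parameters: open-loop stable plant, so p0 = 0 is admissible.
    params = dict(a=-1.0, b=1.0, k=1.0, q=1.0, r=1.0, gamma=2.0)
    p_star = zero_sum_policy_iteration(**params)
    print("critic weight p:", p_star)
    print("HJI residual:", hji_residual(p_star, **params))
```

In this scalar setting the iteration coincides with Newton's method on the game Riccati equation, which is why it needs a stabilizing initial policy. The nonlinear manipulator case described in the abstract replaces the single weight p with a critic neural network and solves the HJI equation only approximately.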


Pages: 183-196

Page count: 14

References: 42 in total
