Actor-Critic-Identifier Structure-Based Decentralized Neuro-Optimal Control of Modular Robot Manipulators With Environmental Collisions

Cited by: 13

Authors:

Dong, Bo [1,2]

An, Tianjiao [1]

Zhou, Fan [1]

Liu, Keping [1]

Yu, Weibo [1]

Li, Yuanchun [1]

Affiliations:

[1] Changchun Univ Technol, Dept Control Sci & Engn, Changchun 130012, Jilin, Peoples R China

[2] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

Source:

IEEE ACCESS | 2019 / Vol. 7

Funding:

National Natural Science Foundation of China;

Keywords:

Adaptive dynamic programming; collision identification; decentralized optimal control; modular robot manipulators; zero-sum game; NONLINEAR-SYSTEMS; ROBUST-CONTROL; POLICY ITERATION; TORQUE; JOINT; STABILIZATION; POSITION; DESIGN;

DOI:

10.1109/ACCESS.2019.2927511

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper presents a decentralized zero-sum optimal control method for modular robot manipulators (MRMs) with environmental collisions via an actor-critic-identifier (ACI) structure-based adaptive dynamic programming (ADP) algorithm. The dynamic model of the MRMs is formulated via a novel collision identification method that is deployed for each joint module, in which the local position and torque information are used to design the model compensation controller. A neural network (NN) identifier is developed to compensate for the model uncertainties, after which the optimal control problem of the MRMs with environmental collisions can be transformed into a two-player zero-sum optimal control problem. Based on the ADP algorithm, the Hamilton-Jacobi-Isaacs (HJI) equation is solved by constructing the actor-critic NNs, thus making the derivation of the approximate optimal control policy feasible. Based on Lyapunov theory, the closed-loop robotic system is proved to be asymptotically stable. Finally, experiments are conducted to verify the effectiveness and advantages of the proposed method.


Pages: 96148-96165

Number of pages: 18

59 references in total
