Distributed Neural Networks Training for Robotic Manipulation With Consensus Algorithm

Cited by: 12
Authors
Liu, Wenxing [1]
Niu, Hanlin [1]
Jang, Inmo [1]
Herrmann, Guido [1]
Carrasco, Joaquin [1]
Affiliations
[1] Univ Manchester, Dept Elect & Elect Engn, Manchester M13 9PL, Lancs, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Training; Reinforcement learning; Convergence; Task analysis; Robot kinematics; Manipulators; Privacy; Consensus; deep reinforcement learning; Lyapunov methods; manipulator; multiagent systems
DOI
10.1109/TNNLS.2022.3191021
CLC Number (Chinese Library Classification)
TP18 [Artificial intelligence theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this article, we propose an algorithm that combines an actor-critic-based off-policy method with consensus-based distributed training to deal with multiagent deep reinforcement learning problems. Specifically, a convergence analysis of a consensus algorithm for a class of nonlinear systems is developed with a Lyapunov method, and we use this result to analyze the convergence properties of the actor and critic training parameters in our algorithm. The convergence analysis verifies that all agents converge to the same optimal model as the training time goes to infinity. To validate the implementation of our algorithm, a multiagent training framework is proposed in which each Universal Robot 5 (UR5) robot arm is trained to reach a random target position. Finally, experiments are provided to demonstrate the effectiveness and feasibility of the proposed algorithm.
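The abstract's central claim is that repeatedly mixing each agent's parameters with those of its neighbours through a consensus update drives all agents toward a common model. The sketch below is a minimal, hypothetical illustration of such a consensus averaging step, not the paper's implementation: it assumes a fixed, connected, undirected communication graph with doubly stochastic mixing weights, and represents each agent's parameters as a flat NumPy vector.

```python
# Hypothetical sketch of consensus-based parameter averaging across agents.
# Assumptions (not from the paper): a fixed, connected, undirected ring graph
# of 4 agents and Metropolis-Hastings mixing weights (doubly stochastic).
import numpy as np

def consensus_step(params, weights):
    """One consensus update: each agent replaces its parameter vector with a
    weighted average of its own and its neighbours' vectors.
    params: (n_agents, dim) array, weights: (n_agents, n_agents) matrix."""
    return weights @ params

rng = np.random.default_rng(0)
params = rng.normal(size=(4, 3))  # 4 agents, 3-dimensional parameter vectors

# Doubly stochastic mixing matrix for a 4-node ring graph.
W = np.array([
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
])

for _ in range(50):
    params = consensus_step(params, W)

# All agents end up close to the network-wide average, mirroring the claim
# that every agent converges to the same model as training proceeds.
print(np.allclose(params, params.mean(axis=0, keepdims=True), atol=1e-6))
```

In the paper's setting, these vectors would stand in for the actor and critic network weights of each UR5 agent, and such mixing steps would presumably be interleaved with each agent's local off-policy gradient updates.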
Pages: 2732-2746
Page count: 15