Distributed Neural Networks Training for Robotic Manipulation With Consensus Algorithm

Cited by: 12
Authors
Liu, Wenxing [1]
Niu, Hanlin [1]
Jang, Inmo [1]
Herrmann, Guido [1]
Carrasco, Joaquin [1]
Affiliations
[1] Univ Manchester, Dept Elect & Elect Engn, Manchester M13 9PL, Lancs, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Training; Reinforcement learning; Convergence; Task analysis; Robot kinematics; Manipulators; Privacy; Consensus; deep reinforcement learning; Lyapunov methods; manipulator; multiagent systems
DOI
10.1109/TNNLS.2022.3191021
CLC Number (Chinese Library Classification)
TP18 [Artificial intelligence theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this article, we propose an algorithm that combines an actor-critic-based off-policy method with consensus-based distributed training to deal with multiagent deep reinforcement learning problems. Specifically, a convergence analysis of a consensus algorithm for a class of nonlinear systems is developed with a Lyapunov method, and we use this result to analyze the convergence properties of the actor and critic training parameters in our algorithm. The convergence analysis verifies that all agents converge to the same optimal model as the training time goes to infinity. To validate the implementation of our algorithm, a multiagent training framework is proposed in which each Universal Robot 5 (UR5) robot arm is trained to reach a random target position. Finally, experiments are provided to demonstrate the effectiveness and feasibility of the proposed algorithm.
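The abstract's central claim is that repeatedly mixing each agent's parameters with those of its neighbours through a consensus update drives all agents toward a common model. The sketch below is a minimal, hypothetical illustration of such a consensus averaging step, not the paper's implementation: it assumes a fixed, connected, undirected communication graph with doubly stochastic mixing weights, and represents each agent's parameters as a flat NumPy vector.

```python
# Hypothetical sketch of consensus-based parameter averaging across agents.
# Assumptions (not from the paper): a fixed, connected, undirected ring graph
# of 4 agents and Metropolis-Hastings mixing weights (doubly stochastic).
import numpy as np

def consensus_step(params, weights):
    """One consensus update: each agent replaces its parameter vector with a
    weighted average of its own and its neighbours' vectors.
    params: (n_agents, dim) array, weights: (n_agents, n_agents) matrix."""
    return weights @ params

rng = np.random.default_rng(0)
params = rng.normal(size=(4, 3))  # 4 agents, 3-dimensional parameter vectors

# Doubly stochastic mixing matrix for a 4-node ring graph.
W = np.array([
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
])

for _ in range(50):
    params = consensus_step(params, W)

# All agents end up close to the network-wide average, mirroring the claim
# that every agent converges to the same model as training proceeds.
print(np.allclose(params, params.mean(axis=0, keepdims=True), atol=1e-6))
```

In the paper's setting, these vectors would stand in for the actor and critic network weights of each UR5 agent, and such mixing steps would presumably be interleaved with each agent's local off-policy gradient updates.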
Pages: 2732-2746
Page count: 15