Proximal policy optimization for formation navigation and obstacle avoidance

Cited by: 0
Authors
Priyam Sadhukhan
Rastko R. Selmic
Affiliations
[1] Concordia University, Electrical and Computer Engineering
Source
International Journal of Intelligent Robotics and Applications | 2022 / Vol. 6
Keywords
Multi-agent formation; Proximal policy algorithm; Second order agents; Obstacle avoidance
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
In this paper, a formation control problem of second-order holonomic agents is considered, where agents navigate around obstacles using proximal policy optimization (PPO)-based deep reinforcement learning (DRL). The formation is allowed to shrink and expand, while maintaining its shape, in order to navigate the geometric centroid of the formation towards the goal. A bearing-based reward function is presented that depends on the bearing error of each agent towards its designated neighbors. The agents share a single policy that is trained in a centralized manner. Distance measurements, state information, error information regarding neighboring agents, and simulation information are used for training the policy in an end-to-end fashion. Simulation results using the proposed approach are compared with those obtained using an angle-based reward function.
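The abstract does not give the exact form of the bearing-based reward, so the following is only an illustrative sketch of what a reward built on bearing error might look like. It assumes 2-D positions, a bearing defined as the unit vector from an agent to a neighbor, and a reward equal to the negative sum of bearing-error norms. The function names `unit_bearing` and `bearing_reward`, and the data layout, are hypothetical and not taken from the paper.

```python
import math


def unit_bearing(p_from, p_to):
    """Unit vector pointing from p_from to p_to (2-D positions)."""
    dx, dy = p_to[0] - p_from[0], p_to[1] - p_from[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("agents coincide; bearing is undefined")
    return (dx / norm, dy / norm)


def bearing_reward(positions, desired_bearings, neighbors):
    """Hypothetical reward: negative sum of bearing-error norms.

    positions        -- dict: agent id -> (x, y)
    desired_bearings -- dict: (i, j) -> desired unit bearing from i to j
    neighbors        -- dict: agent id -> list of designated neighbor ids
    """
    total_error = 0.0
    for i, nbrs in neighbors.items():
        for j in nbrs:
            g = unit_bearing(positions[i], positions[j])
            g_star = desired_bearings[(i, j)]
            total_error += math.hypot(g[0] - g_star[0], g[1] - g_star[1])
    return -total_error
```

Because bearings are unchanged by translation and uniform scaling, a reward of this kind stays at its maximum (zero) when the formation shrinks or expands without changing shape, which is consistent with the scaling behavior the abstract describes.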
Pages: 746-759
Page count: 13