Proximal policy optimization for formation navigation and obstacle avoidance

Cited by: 0

Authors:

Priyam Sadhukhan

Rastko R. Selmic

Affiliations:

[1] Concordia University, Electrical and Computer Engineering

Source:

International Journal of Intelligent Robotics and Applications | 2022 / Vol. 6

Keywords:

Multi-agent formation; Proximal policy algorithm; Second order agents; Obstacle avoidance

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

In this paper, a formation control problem of second-order holonomic agents is considered, where agents navigate around obstacles using proximal policy optimization (PPO)-based deep reinforcement learning (DRL). The formation is allowed to shrink and expand while maintaining its shape, so that the geometric centroid of the formation can be navigated towards the goal. A bearing-based reward function is presented that depends on the bearing error of each agent towards its designated neighbors. The agents share a single policy that is trained in a centralized manner. Distance measurements, state information, error information regarding neighboring agents, and simulation information are used for training the policy in an end-to-end fashion. Simulation results obtained using the proposed approach are compared with those obtained using an angle-based reward function.
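The abstract does not give the reward formula, so the following is only a minimal sketch of what a bearing-based reward could look like, not the authors' definition. It assumes that the bearing from agent i to neighbor j is the unit vector between their positions and that the reward is the negative sum of bearing-error norms; the names `bearing`, `bearing_reward`, `template`, and `neighbors` are hypothetical. The example also illustrates why bearings suit a formation that may shrink and expand: they are unchanged by translation and uniform scaling.

```python
import numpy as np

def bearing(p_i, p_j):
    """Unit vector pointing from agent i to agent j (assumed bearing definition)."""
    d = np.asarray(p_j, dtype=float) - np.asarray(p_i, dtype=float)
    n = np.linalg.norm(d)
    if n == 0.0:
        raise ValueError("agents coincide; bearing is undefined")
    return d / n

def bearing_reward(positions, desired_bearings, neighbors):
    """Hypothetical per-agent reward: negative sum of bearing-error norms
    over each agent's designated neighbors (an assumption, not the paper's formula)."""
    rewards = np.zeros(len(positions))
    for i, nbrs in neighbors.items():
        for j in nbrs:
            err = bearing(positions[i], positions[j]) - desired_bearings[(i, j)]
            rewards[i] -= np.linalg.norm(err)
    return rewards

# Illustrative three-agent triangular formation (made-up coordinates).
template = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
desired = {(i, j): bearing(template[i], template[j])
           for i in neighbors for j in neighbors[i]}

# Same shape, scaled and translated: bearing errors vanish.
scaled = 2.5 * template + np.array([3.0, -1.0])
# Distorted shape: agent 2 is displaced, so bearing errors appear.
distorted = template.copy()
distorted[2] = [1.0, 1.0]

print(bearing_reward(scaled, desired, neighbors))
print(bearing_reward(distorted, desired, neighbors))
```

Under these assumptions the scaled and translated formation receives no penalty, while the distorted one is penalized for every agent whose bearing to a neighbor has changed.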

Pages: 746-759

Page count: 13
