Learn to Follow: Decentralized Lifelong Multi-Agent Pathfinding via Planning and Learning


Authors:

Skrynnik, Alexey [1,2]

Andreychuk, Anton [1]

Nesterova, Maria [2,3]

Yakovlev, Konstantin [1,2]

Panov, Aleksandr [1,3]

Affiliations:

[1] AIRI, Moscow, Russia

[2] Russian Acad Sci, Fed Res Ctr Comp Sci & Control, Moscow, Russia

[3] MIPT, Dolgoprudnyi, Russia

Source:

Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol. 38, No. 16 | 2024

Keywords:

REINFORCEMENT;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The Multi-Agent Pathfinding (MAPF) problem generally asks for a set of conflict-free paths for a set of agents confined to a graph, and it is typically solved in a centralized fashion. In this work, by contrast, we investigate the decentralized MAPF setting, in which there is no central controller with full information on the agents' locations and goals, and the agents must decide on their actions sequentially and on their own, without access to the full state of the environment. We focus on the practically important lifelong variant of MAPF, in which agents are continuously assigned new goals upon reaching their previous ones. To address this complex problem, we propose a method that integrates two complementary approaches: planning with heuristic search and reinforcement learning through policy optimization. Planning is used to construct and re-plan individual paths. We enhance our planning algorithm with a dedicated technique designed to avoid congestion and increase the throughput of the system. We employ reinforcement learning to discover collision avoidance policies that guide the agents along the paths. The policy is implemented as a neural network and is trained without any reward shaping or external guidance. We evaluate our method on a wide range of setups, comparing it to state-of-the-art solvers. The results show that our method consistently outperforms the learnable competitors, achieving higher throughput and generalizing better to maps that were unseen during training. Moreover, our solver outperforms a rule-based one in terms of throughput and is an order of magnitude faster than a state-of-the-art search-based solver. The code is available at https://github.com/AIRI-Institute/learn-to-follow.
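To make the planning component of the abstract more concrete, the following is a minimal sketch of heuristic search for a single agent on a grid, with an optional per-cell penalty that steers the path away from crowded cells. It is an illustration only and is not taken from the paper or its repository: the abstract does not specify the congestion-avoidance technique, so the function name `plan_path`, the `congestion` map, and the `penalty` weight are all assumptions made for this example.

```python
import heapq


def plan_path(grid, start, goal, congestion=None, penalty=1.0):
    """A* on a 4-connected grid (hypothetical helper, not the authors' code).

    grid[r][c] == 1 marks an obstacle. `congestion` is an assumed map from
    cell to a non-negative load; entering a cell costs 1 + penalty * load.
    Returns a list of cells from start to goal, or None if unreachable.
    """
    congestion = congestion or {}
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance; admissible because every step costs at least 1.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk parent links back to the start to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if g > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was found already
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            new_g = g + 1.0 + penalty * congestion.get(nxt, 0.0)
            if new_g < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_g
                parent[nxt] = cell
                heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))
    return None
```

In a lifelong setting, a planner of this kind would be called again each time an agent receives a new goal, while a separately learned policy (not sketched here) would handle local collision avoidance along the returned path.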


Pages: 17541-17549

Page count: 9
