Learn to Follow: Decentralized Lifelong Multi-Agent Pathfinding via Planning and Learning

Authors
Skrynnik, Alexey [1, 2]
Andreychuk, Anton [1]
Nesterova, Maria [2, 3]
Yakovlev, Konstantin [1, 2]
Panov, Aleksandr [1, 3]
Affiliations
[1] AIRI, Moscow, Russia
[2] Russian Acad Sci, Fed Res Ctr Comp Sci & Control, Moscow, Russia
[3] MIPT, Dolgoprudnyi, Russia
Source
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16 | 2024
Keywords
REINFORCEMENT;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Multi-agent Pathfinding (MAPF) problem generally asks to find a set of conflict-free paths for a set of agents confined to a graph, and it is typically solved in a centralized fashion. In this work, by contrast, we investigate the decentralized MAPF setting, in which there is no central controller with full information on the agents' locations and goals, and the agents must decide on their actions sequentially and on their own, without access to the full state of the environment. We focus on the practically important lifelong variant of MAPF, in which agents are continuously assigned new goals upon reaching their previous ones. To address this complex problem, we propose a method that integrates two complementary approaches: planning with heuristic search and reinforcement learning through policy optimization. Planning is used to construct and re-plan individual paths. We enhance our planning algorithm with a dedicated technique tailored to avoid congestion and increase the throughput of the system. We employ reinforcement learning to discover collision avoidance policies that effectively guide the agents along the paths. The policy is implemented as a neural network and is trained effectively without any reward shaping or external guidance. We evaluate our method on a wide range of setups, comparing it to state-of-the-art solvers. The results show that our method consistently outperforms the learnable competitors, showing higher throughput and a better ability to generalize to maps that were unseen at the training stage. Moreover, our solver outperforms a rule-based one in terms of throughput and is an order of magnitude faster than a state-of-the-art search-based solver. The code is available at https://github.com/AIRI-Institute/learn-to-follow.
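The planning half of the approach described in the abstract (individual path planning with heuristic search, biased away from congested areas) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `plan_path`, the `congestion` dictionary of per-cell agent counts, and the `penalty` weight are all hypothetical, and adding a congestion term to the step cost is only one assumed way to steer paths away from crowded cells. The learned collision-avoidance policy is not shown.

```python
import heapq


def plan_path(grid, start, goal, congestion=None, penalty=1.0):
    """A* on a 4-connected grid; grid[r][c] == 1 marks an obstacle.

    `congestion` (hypothetical) maps a cell to how many other agents were
    recently seen there; each count adds `penalty` to the cost of entering
    that cell, so the search prefers less crowded routes.
    """
    congestion = congestion or {}
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance; admissible because every step costs at least 1.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0.0, start)]
    g_cost = {start: 0.0}
    parent = {start: None}

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if g > g_cost[cell]:
            continue  # stale heap entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc]:
                continue
            new_g = g + 1.0 + penalty * congestion.get(nxt, 0)
            if new_g < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = new_g
                parent[nxt] = cell
                heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return None  # goal unreachable
```

On an empty 3x3 grid, planning from `(0, 0)` to `(0, 2)` follows the top row; passing `congestion={(0, 1): 5}` makes the search detour through the middle row instead. In a lifelong setting, each agent would call such a planner again whenever it receives a new goal.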
Pages: 17541-17549
Page count: 9