HELSA: Hierarchical Reinforcement Learning with Spatiotemporal Abstraction for Large-Scale Multi-Agent Path Finding

Cited by: 1
Authors
Song, Zhaoyi [1]
Zhang, Rongqing [1]
Cheng, Xiang [2]
Affiliations
[1] Tongji Univ, Sch Software Engn, Shanghai 200092, Peoples R China
[2] Peking Univ, Sch Elect, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
DOI
10.1109/IROS55552.2023.10342261
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Multi-Agent Path Finding (MAPF) problem is a critical challenge in dynamic multi-robot systems. Recent studies have revealed that multi-agent reinforcement learning (MARL) is a promising approach to solving MAPF problems in a fully decentralized manner. However, as the size of the multi-robot system increases, sample inefficiency becomes a major impediment to learning-based methods. This paper presents a hierarchical reinforcement learning (HRL) framework for large-scale multi-agent path finding that applies spatial and temporal abstraction to capture intermediate rewards and thereby encourage efficient exploration. Specifically, we introduce a meta controller that partitions the map into interconnected regions and optimizes agents' region-wise paths towards globally better solutions. Additionally, we design a lower-level controller that efficiently solves each sub-problem by combining heuristic guidance and an inter-agent communication mechanism with RL-based policies. Our empirical results on test instances of various scales demonstrate that our method outperforms existing approaches in terms of both success rate and makespan.
Pages: 7318-7325
Page count: 8