Safe Multi-Agent Reinforcement Learning via Approximate Hamilton-Jacobi Reachability

Authors
Kai Zhu [1]
Fengbo Lan [1]
Wenbo Zhao [1]
Tao Zhang [1]
Affiliations
[1] Department of Automation, Tsinghua University
[2] Beijing National Research Center for Information Science and Technology
Keywords
Multi-agent systems; Deep reinforcement learning; Safety satisfaction; Hamilton-Jacobi reachability
DOI
10.1007/s10846-024-02156-6
Abstract
Multi-Agent Reinforcement Learning (MARL) promises to address the challenges of cooperation and competition among multiple agents, often in safety-critical scenarios. However, progress toward safe MARL remains limited. Current works extend single-agent safe learning approaches, employing shielding or backup policies to ensure safety. These approaches require good cooperation among agents, and weakly distributed approaches with centralized shielding become infeasible when agents encounter complex situations such as non-cooperative agents and coordination failures. In this paper, we integrate Hamilton-Jacobi (HJ) reachability theory and present a Centralized Training and Decentralized Execution (CTDE) framework for safe MARL. Our framework enables the learning of safety policies without requiring a system model or shielding-layer pre-training. Additionally, we enhance adaptability to varying levels of cooperation through a conservative approximation of the value function. Experimental results validate the efficacy of the proposed method, demonstrating its ability to ensure safety while achieving target tasks under cooperative conditions. Furthermore, our approach is robust to non-cooperative behaviors induced by complex disturbance factors.
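The abstract gives no equations, so the following is only a minimal sketch of the general idea behind a reachability-style safety value, not the paper's algorithm. It applies a discounted safety Bellman backup of the kind used in learning-based HJ reachability to a toy 1-D grid. Everything here is an illustrative assumption: the grid, the margin function `safety_margin`, the action set, and the adversarial `disturbances` term, which loosely stands in for a non-cooperative agent and makes the estimate conservative (worst case over the disturbance).

```python
import numpy as np

N_STATES = 7          # hypothetical grid: states 0..6, state 6 is the failure state
GAMMA = 0.99          # discount factor for the safety backup (assumed value)
ACTIONS = (-1, 0, 1)  # the agent can move left, stay, or move right


def disturbances(s):
    # Assumed adversarial drift: stronger near the failure state.
    return (0, 2) if s >= 4 else (0, 1)


def safety_margin():
    # l(s) > 0 means "currently safe"; only state 6 has a negative margin.
    return 5.5 - np.arange(N_STATES)


def solve_safety_value(max_iters=1000, tol=1e-9):
    """Iterate V(s) = (1-g)*l(s) + g*min(l(s), max_a min_d V(s'))."""
    l = safety_margin()
    v = l.copy()
    for _ in range(max_iters):
        v_new = np.empty_like(v)
        for s in range(N_STATES):
            # Best action against the worst-case disturbance.
            best = max(
                min(v[int(np.clip(s + a + d, 0, N_STATES - 1))]
                    for d in disturbances(s))
                for a in ACTIONS
            )
            v_new[s] = (1 - GAMMA) * l[s] + GAMMA * min(l[s], best)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v


l = safety_margin()
v = solve_safety_value()
# States from which safety can be maintained against the worst-case drift.
safe_states = [s for s in range(N_STATES) if v[s] > 0]
```

In this toy setting, states 4 and 5 have a positive margin but a negative value, because the assumed drift can force them into the failure state regardless of the agent's action. That gap between "safe now" and "safe forever" is what a reachability value is meant to capture.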