Near-Optimal Multi-Agent Learning for Safe Coverage Control

被引:0
作者
Prajapat, Manish [1 ]
Turchetta, Matteo [1 ]
Zeilinger, Melanie N. [1 ]
Krause, Andreas [1 ]
机构
[1] Swiss Fed Inst Technol, Zurich, Switzerland
来源
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022) | 2022年
基金
瑞士国家科学基金会;
关键词
GAUSSIAN-PROCESSES; REGRET BOUNDS;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In multi-agent coverage control problems, agents navigate their environment to reach locations that maximize the coverage of some density. In practice, the density is rarely known a priori, further complicating the original NP-hard problem. Moreover, in many applications, agents cannot visit arbitrary locations due to a priori unknown safety constraints. In this paper, we aim to efficiently learn the density to approximately solve the coverage problem while preserving the agents' safety. We first propose a conditionally linear submodular coverage function that facilitates theoretical analysis. Utilizing this structure, we develop MACOPT, a novel algorithm that efficiently trades off the exploration-exploitation dilemma due to partial observability, and show that it achieves sublinear regret. Next, we extend results on single-agent safe exploration to our multi-agent setting and propose SAFEMAC for safe coverage and exploration. We analyze SAFEMAC and give first of its kind results: near optimal coverage in finite time while provably guaranteeing safety. We extensively evaluate our algorithms on synthetic and real problems, including a biodiversity monitoring task under safety constraints, where SAFEMAC outperforms competing methods.
引用
收藏
页数:15
相关论文
共 69 条
  • [31] Krause A, 2008, J MACH LEARN RES, V9, P235
  • [32] Krause A, 2014, TRACTABILITY, P71
  • [33] Submodularity and its Applications in Optimized Information Gathering
    Krause, Andreas
    Guestrin, Carlos
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (04)
  • [34] Lattimore T., 2019, C LEARN THEOR, P2111
  • [35] Lattimore T, 2020, BANDIT ALGORITHMS, P423
  • [36] Leike J., 2017, Ai safety gridworlds
  • [37] NONUNIFORM COVERAGE AND CARTOGRAMS
    Lekien, Francois
    Leonard, Naomi Ehrich
    [J]. SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2009, 48 (01) : 351 - 372
  • [38] Leskovec J, 2007, KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, P420
  • [39] Liu Iou-Jen, 2021, International Conference on Machine Learning, P6826
  • [40] LLOYD SP, 1982, IEEE T INFORM THEORY, V28, P129, DOI 10.1109/TIT.1982.1056489