Near-Optimal Multi-Agent Learning for Safe Coverage Control

被引：0

作者：

Prajapat, Manish ^{[1
]}

Turchetta, Matteo ^{[1
]}

Zeilinger, Melanie N. ^{[1
]}

Krause, Andreas ^{[1
]}

机构：

[1] Swiss Fed Inst Technol, Zurich, Switzerland

来源：

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022) | 2022年

基金：

瑞士国家科学基金会;

关键词：

GAUSSIAN-PROCESSES; REGRET BOUNDS;

D O I：

暂无

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

In multi-agent coverage control problems, agents navigate their environment to reach locations that maximize the coverage of some density. In practice, the density is rarely known a priori, further complicating the original NP-hard problem. Moreover, in many applications, agents cannot visit arbitrary locations due to a priori unknown safety constraints. In this paper, we aim to efficiently learn the density to approximately solve the coverage problem while preserving the agents' safety. We first propose a conditionally linear submodular coverage function that facilitates theoretical analysis. Utilizing this structure, we develop MACOPT, a novel algorithm that efficiently trades off the exploration-exploitation dilemma due to partial observability, and show that it achieves sublinear regret. Next, we extend results on single-agent safe exploration to our multi-agent setting and propose SAFEMAC for safe coverage and exploration. We analyze SAFEMAC and give first of its kind results: near optimal coverage in finite time while provably guaranteeing safety. We extensively evaluate our algorithms on synthetic and real problems, including a biodiversity monitoring task under safety constraints, where SAFEMAC outperforms competing methods.

引用

页数：15

共 69 条

[31] Krause A, 2008, J MACH LEARN RES, V9, P235
[32] Krause A, 2014, TRACTABILITY, P71
[33] Submodularity and its Applications in Optimized Information Gathering
Krause, Andreas
Guestrin, Carlos
[J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (04)
[34] Lattimore T., 2019, C LEARN THEOR, P2111
[35] Lattimore T, 2020, BANDIT ALGORITHMS, P423
[36] Leike J., 2017, Ai safety gridworlds
[37] NONUNIFORM COVERAGE AND CARTOGRAMS
Lekien, Francois
Leonard, Naomi Ehrich
[J]. SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2009, 48 (01) : 351 - 372
[38] Leskovec J, 2007, KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, P420
[39] Liu Iou-Jen, 2021, International Conference on Machine Learning, P6826
[40] LLOYD SP, 1982, IEEE T INFORM THEORY, V28, P129, DOI 10.1109/TIT.1982.1056489

← 1 2 3 4 5 6 7 →