Correctness-guaranteed strategy synthesis and compression for multi-agent autonomous systems

被引：8

作者：

Gu, Rong ^{[1
]}

Jensen, Peter G. ^{[2
]}

Seceleanu, Cristina ^{[1
]}

Enoiu, Eduard ^{[1
]}

Lundqvist, Kristina ^{[1
]}

机构：

[1] Malardalen Univ, Vasteras, Sweden

[2] Aalborg Univ, Aalborg, Denmark

来源：

SCIENCE OF COMPUTER PROGRAMMING | 2022年 / 224卷

关键词：

Planning; Multi-agent autonomous systems; Timed games; Reinforcement learning; Strategy compression; ALGORITHMS; TIME;

D O I：

10.1016/j.scico.2022.102894

中图分类号：

TP31 [计算机软件];

学科分类号：

081202 ; 0835 ;

摘要：

Planning is a critical function of multi-agent autonomous systems, which includes path finding and task scheduling. Exhaustive search-based methods such as model checking and algorithmic game theory can solve simple instances of multi-agent planning. However, these methods suffer from state-space explosion when the number of agents is large. Learning-based methods can alleviate this problem, but lack a guarantee of correctness of the results. In this paper, we introduce MoCReL, a new version of our previously proposed method that combines model checking with reinforcement learning in solving the planning problem. The approach takes advantage of reinforcement learning to synthesize path plans and task schedules for large numbers of autonomous agents, and of model checking to verify the correctness of the synthesized strategies. Further, MoCReL can compress large strategies into smaller ones that have down to 0.05% of the original sizes, while preserving their correctness, which we show in this paper. MoCReL is integrated into a new version of UPPAAL STRATEGO that supports calling external libraries when running learning and verification of timed games models.(c) 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

引用

页数：25

共 49 条

[1] Scheduling with timed automata
Abdeddaïm, Y
Asarin, E
Maler, O
[J]. THEORETICAL COMPUTER SCIENCE, 2006, 354 (02) : 272 - 300
[2] A THEORY OF TIMED AUTOMATA
ALUR, R
DILL, DL
[J]. THEORETICAL COMPUTER SCIENCE, 1994, 126 (02) : 183 - 235
[3] Compositional Synthesis of Reactive Controllers for Multi-agent Systems
Alur, Rajeev
Moarref, Salar
Topcu, Ufuk
[J]. COMPUTER AIDED VERIFICATION: 28TH INTERNATIONAL CONFERENCE, CAV 2016, PT II, 2016, 9780 : 251 - 269
[4] Andersen M.S., 2004, IFAC Proceedings Volumes, V37, P597
[5] dtControl 2.0: Explainable Strategy Representation via Decision Tree Learning Steered by Experts
Ashok, Pranav
Jackermeier, Mathias
Kretinsky, Jan
Weinhuber, Christoph
Weininger, Maximilian
Yadav, Mayank
[J]. TOOLS AND ALGORITHMS FOR THE CONSTRUCTION AND ANALYSIS OF SYSTEMS, PT II, TACAS 2021, 2021, 12652 : 326 - 345
[6] Ashok P., 2020, P 23 INT C HYBR SYST, P1
[7] Current advancements on autonomous mission planning and management systems: An AUV and UAV perspective
Atyabi, Adham
MahmoudZadeh, Somaiyeh
Nefti-Meziani, Samia
[J]. ANNUAL REVIEWS IN CONTROL, 2018, 46 : 196 - 215
[8] Baier C, 2008, PRINCIPLES OF MODEL CHECKING, P1
[9] Basile D., 2020, INT C FORM TECHN DIS
[10] Behjati R, 2010, LECT NOTES COMPUT SC, V5961, P292

← 1 2 3 4 5 →