Correctness-guaranteed strategy synthesis and compression for multi-agent autonomous systems

被引:8
作者
Gu, Rong [1 ]
Jensen, Peter G. [2 ]
Seceleanu, Cristina [1 ]
Enoiu, Eduard [1 ]
Lundqvist, Kristina [1 ]
机构
[1] Malardalen Univ, Vasteras, Sweden
[2] Aalborg Univ, Aalborg, Denmark
关键词
Planning; Multi-agent autonomous systems; Timed games; Reinforcement learning; Strategy compression; ALGORITHMS; TIME;
D O I
10.1016/j.scico.2022.102894
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Planning is a critical function of multi-agent autonomous systems, which includes path finding and task scheduling. Exhaustive search-based methods such as model checking and algorithmic game theory can solve simple instances of multi-agent planning. However, these methods suffer from state-space explosion when the number of agents is large. Learning-based methods can alleviate this problem, but lack a guarantee of correctness of the results. In this paper, we introduce MoCReL, a new version of our previously proposed method that combines model checking with reinforcement learning in solving the planning problem. The approach takes advantage of reinforcement learning to synthesize path plans and task schedules for large numbers of autonomous agents, and of model checking to verify the correctness of the synthesized strategies. Further, MoCReL can compress large strategies into smaller ones that have down to 0.05% of the original sizes, while preserving their correctness, which we show in this paper. MoCReL is integrated into a new version of UPPAAL STRATEGO that supports calling external libraries when running learning and verification of timed games models.(c) 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
引用
收藏
页数:25
相关论文
共 49 条
  • [1] Scheduling with timed automata
    Abdeddaïm, Y
    Asarin, E
    Maler, O
    [J]. THEORETICAL COMPUTER SCIENCE, 2006, 354 (02) : 272 - 300
  • [2] A THEORY OF TIMED AUTOMATA
    ALUR, R
    DILL, DL
    [J]. THEORETICAL COMPUTER SCIENCE, 1994, 126 (02) : 183 - 235
  • [3] Compositional Synthesis of Reactive Controllers for Multi-agent Systems
    Alur, Rajeev
    Moarref, Salar
    Topcu, Ufuk
    [J]. COMPUTER AIDED VERIFICATION: 28TH INTERNATIONAL CONFERENCE, CAV 2016, PT II, 2016, 9780 : 251 - 269
  • [4] Andersen M.S., 2004, IFAC Proceedings Volumes, V37, P597
  • [5] dtControl 2.0: Explainable Strategy Representation via Decision Tree Learning Steered by Experts
    Ashok, Pranav
    Jackermeier, Mathias
    Kretinsky, Jan
    Weinhuber, Christoph
    Weininger, Maximilian
    Yadav, Mayank
    [J]. TOOLS AND ALGORITHMS FOR THE CONSTRUCTION AND ANALYSIS OF SYSTEMS, PT II, TACAS 2021, 2021, 12652 : 326 - 345
  • [6] Ashok P., 2020, P 23 INT C HYBR SYST, P1
  • [7] Current advancements on autonomous mission planning and management systems: An AUV and UAV perspective
    Atyabi, Adham
    MahmoudZadeh, Somaiyeh
    Nefti-Meziani, Samia
    [J]. ANNUAL REVIEWS IN CONTROL, 2018, 46 : 196 - 215
  • [8] Baier C, 2008, PRINCIPLES OF MODEL CHECKING, P1
  • [9] Basile D., 2020, INT C FORM TECHN DIS
  • [10] Behjati R, 2010, LECT NOTES COMPUT SC, V5961, P292