Temporal sampling annealing schemes for receding horizon multi-agent planning

Cited by: 0

Authors:

Ma, Aaron [1]

Ouimet, Mike [2]

Cortes, Jorge [1]

Affiliations:

[1] Univ Calif San Diego, La Jolla, CA 92093 USA

[2] Singular Genom, San Diego, CA USA

Source:

ROBOTICS AND AUTONOMOUS SYSTEMS | 2021 / Vol. 143

Keywords:

Multi-agent planning; Potential games; Reinforcement learning; Simulated annealing; GO; GAME;

DOI:

10.1016/j.robot.2021.103823

Chinese Library Classification number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

This paper deals with multi-agent scenarios where individual agents must coordinate their plans in order to efficiently complete a set of tasks. Our strategy formulates the task planning problem as a potential game and uses decentralized stochastic sampling policies to reach a consensus on which sequences of actions agents should take. We execute this over a receding finite time horizon and take special care to discourage agents from breaking promises in the near future, which may cause other agents to unsuccessfully attempt a joint action. At the same time, we allow agents to change plans in the distant future, as this gives time for other agents to adapt their plans, allowing the team to escape locally optimal solutions. To do this, we introduce two sampling schemes for new actions: a geometric-based scheme, where the probability of sampling a new action increases geometrically in time, and an inference-based sampling scheme, where a convolutional neural network provides recommendations for joint actions. We test the proposed schemes in a cooperative orienteering environment to illustrate their performance and validate the intuition behind their design. © 2021 The Author(s). Published by Elsevier B.V.
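The geometric-based scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the functional form p_t = min(1, p0 * ratio^t), the parameter names `p0` and `ratio`, and the uniform redraw over a hypothetical action set are all assumptions chosen only to show how near-term actions stay stable while distant ones are resampled often.

```python
import random


def geometric_resample_probs(horizon, p0=0.05, ratio=1.5):
    """Per-step probability of sampling a new action over the horizon.

    Assumed form (not taken from the paper): p_t = min(1, p0 * ratio**t),
    so step 0 is rarely changed and later steps are changed more often.
    """
    if horizon < 0 or not 0.0 <= p0 <= 1.0 or ratio < 1.0:
        raise ValueError("need horizon >= 0, 0 <= p0 <= 1, ratio >= 1")
    return [min(1.0, p0 * ratio ** t) for t in range(horizon)]


def resample_plan(plan, actions, probs, rng):
    """Return a copy of plan where step t is redrawn with probability probs[t].

    The uniform redraw is a placeholder assumption; the paper's stochastic
    sampling policies are driven by the potential-game formulation.
    """
    new_plan = list(plan)
    for t, p in enumerate(probs):
        if rng.random() < p:
            new_plan[t] = rng.choice(actions)
    return new_plan


# Hypothetical single-agent usage with a six-step horizon.
probs = geometric_resample_probs(horizon=6)
plan = ["stay"] * 6
actions = ["stay", "north", "south", "east", "west"]
print(resample_plan(plan, actions, probs, random.Random(0)))
```

With these assumed defaults, the first step is redrawn about 5% of the time, and the probability grows by a factor of 1.5 at each later step, which mirrors the abstract's intent of keeping near-term promises while leaving distant plans open to change.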


Pages: 13
