Specification Aware Multi-Agent Reinforcement Learning

Cited by: 3

Authors:

Ritz, Fabian [1]

Phan, Thomy [1]

Mueller, Robert [1]

Gabor, Thomas [1]

Sedlmeier, Andreas [1]

Zeller, Marc [2]

Wieghardt, Jan [2]

Schmid, Reiner [2]

Sauer, Horst [2]

Klein, Cornel [2]

Linnhoff-Popien, Claudia [1]

Affiliations:

[1] Ludwig Maximilians Univ Munchen, Mobile & Distributed Syst Grp, Munich, Germany

[2] Siemens AG, Corp Technol CT, Munich, Germany

Source:

AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2021 | 2022 / Vol. 13251

Keywords:

Multi-agent; Reinforcement learning; Specification compliance; AI safety

DOI:

10.1007/978-3-031-10161-8_1

Chinese Library Classification:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Engineering intelligent industrial systems is challenging due to high complexity and uncertainty with respect to domain dynamics and multiple agents. If industrial systems act autonomously, their choices and results must stay within specified bounds to satisfy the system's requirements. Reinforcement learning (RL) is a promising way to find solutions that outperform known or handcrafted heuristics. However, in industrial scenarios it is also crucial to prevent RL from inducing potentially undesired or even dangerous behavior. This paper considers specification alignment in industrial scenarios with multi-agent reinforcement learning (MARL). We propose to embed functional and non-functional requirements into the reward function, enabling the agents to learn to align with the specification. We evaluate our approach in a smart factory simulation representing an industrial lot-size-one production facility, where we train up to eight agents using DQN, VDN, and QMIX. Our results show that the proposed approach enables agents to satisfy a given set of requirements.
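
The idea of embedding requirements into the reward function can be illustrated with a minimal sketch. This is not the authors' reward design: the `Requirement` class, the requirement names, the state keys, and the penalty values below are all hypothetical, chosen only to show how a task reward might be combined with penalties for violated requirements.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical state representation: a flat dictionary of numeric features.
State = Dict[str, float]


@dataclass
class Requirement:
    """One requirement of the specification (illustrative, not from the paper)."""
    name: str
    is_violated: Callable[[State], bool]  # predicate over the current state
    penalty: float                        # amount subtracted on violation


def specification_aware_reward(
    task_reward: float, state: State, requirements: List[Requirement]
) -> float:
    """Return the task reward minus the penalty of every violated requirement."""
    total = task_reward
    for req in requirements:
        if req.is_violated(state):
            total -= req.penalty
    return total


# Assumed example requirements: one safety-like, one time-related.
requirements = [
    Requirement("no_collision", lambda s: s["collisions"] > 0, penalty=1.0),
    Requirement("deadline", lambda s: s["steps"] > s["max_steps"], penalty=0.5),
]

compliant = {"collisions": 0, "steps": 10, "max_steps": 50}
colliding = {"collisions": 1, "steps": 10, "max_steps": 50}

r_ok = specification_aware_reward(1.0, compliant, requirements)
r_bad = specification_aware_reward(1.0, colliding, requirements)
```

In this sketch, a compliant step keeps the full task reward while a step with a collision loses the collision penalty, so an agent maximizing return is pushed toward behavior that satisfies the listed requirements. How the penalties are weighted against the task reward is a design choice that this example does not settle.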

Pages: 3-21

Page count: 19
