Tacit Commitments Emergence in Multi-agent Reinforcement Learning

Cited by: 0

Authors:

Li, Boyin [1,2]

Pu, Zhiqiang [1,2]

Gao, Junlong [3]

Yi, Jianqiang [1,2]

Guo, Zhenyu [3]

Affiliations:

[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China

[2] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China

[3] Alibaba Grp, Hangzhou, Peoples R China

Source:

NEURAL INFORMATION PROCESSING, PT I, ICONIP 2022 | 2023 / Vol. 13623

Funding:

National Natural Science Foundation of China;

Keywords:

Causal Inference; Counterfactual Reasoning; Intrinsic Reward; Multi-agent Systems; Reinforcement Learning;

DOI:

10.1007/978-3-031-30105-6_3

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Tacit commitments have been widely seen as a crucial underpinning for real-world cooperation, and they could likewise be key to multi-agent cooperation. This paper proposes a novel tacit commitment emergence framework for multi-agent reinforcement learning (MARL), called TCEM. In MARL, we define a commitment as the unique state that the agent will exhibit through its action. TCEM first equips each agent with a commitment inference module (CIM) to infer its neighbors' commitments. Then, TCEM introduces a commitment influence intrinsic motivation (CIR) that encourages agents to exert causal influence on others' actions. Finally, a commitment acceptance intrinsic (CAI) motivation is constructed to guide each agent to act with its neighbors' commitments in mind. Both CIR and CAI compute intrinsic rewards using counterfactual reasoning derived from causal inference. Empirical results show that the method effectively improves learning performance and yields better cooperation among agents, leading to superior performance on the Google Research Football benchmark.
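The abstract does not give the reward formulas, so the following is only a generic sketch of how a counterfactual influence reward can be computed, not the authors' CIR or CAI. It assumes a hypothetical table `cond_probs` giving a neighbor's action distribution for each action the agent might take, and it scores the action actually taken by the KL divergence from the counterfactual average over the agent's alternative actions. The function name and all parameters are illustrative assumptions.

```python
import math


def counterfactual_influence(cond_probs, own_policy, taken_action):
    """Generic influence score (an assumption, not the paper's definition).

    cond_probs[k][m]: assumed probability that the neighbor picks action m
                      when this agent picks action k.
    own_policy[k]:    this agent's probability of picking action k
                      (must be non-zero for taken_action).
    Returns KL(p(. | taken_action) || counterfactual marginal).
    """
    n_neighbor_actions = len(cond_probs[0])
    # Counterfactual marginal: average the neighbor's response over the
    # actions this agent could have taken instead.
    marginal = [
        sum(own_policy[k] * cond_probs[k][m] for k in range(len(own_policy)))
        for m in range(n_neighbor_actions)
    ]
    actual = cond_probs[taken_action]
    # KL divergence; zero-probability terms contribute nothing.
    return sum(p * math.log(p / q) for p, q in zip(actual, marginal) if p > 0.0)


# A neighbor that ignores the agent versus one that copies it.
ignoring = counterfactual_influence([[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5], 0)
copying = counterfactual_influence([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5], 0)
```

Under these assumptions the score is zero when the neighbor's behavior does not depend on the agent's action and grows as the dependence becomes stronger, which is the property an influence-style intrinsic reward relies on.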

Pages: 27-36

Page count: 10
