Causal Reinforcement Learning in Iterated Prisoner's Dilemma

Cited by: 3

Authors:

Kazemi, Yosra [1]

Chanel, Caroline P. C. [2]

Givigi, Sidney [1]

Affiliations:

[1] Queens Univ, Sch Comp, Kingston, ON K7L 2N8, Canada

[2] Univ Toulouse, Inst Super Aeronaut & Espace ISAE SUPAERO, Dept Design & Control Aerosp Vehicles, F-31013 Toulouse, France

Source:

IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS | 2024, Vol. 11, No. 2

Keywords:

Causal inference; game theory; prisoner's dilemma (PD); reinforcement learning (RL); social dilemma

DOI:

10.1109/TCSS.2023.3289470

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

The iterated prisoner's dilemma (IPD) is an archetypal paradigm to model cooperation and has guided studies on social dilemmas. In this work, we develop a causal reinforcement learning (CRL) strategy in a PD game. An agent is designed to have an explicit causal representation of other agents playing strategies from the Axelrod tournament. The collection of policies is assembled in an ensemble RL to choose the best strategy. The agent is then tested against selected Axelrod tournament strategies as well as an adaptive agent trained using traditional RL. Results show that our agent is able to play against all other players and score higher while being adaptive in situations where the other players' strategy changes. Furthermore, the decision taken by the agent can be explained in terms of the causal representation of the interactions. Based on the decision made by the agent, a human observer can understand the chosen strategy.
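For readers unfamiliar with the setting, the IPD game itself (not the paper's CRL agent) can be sketched in a few lines. This is only an illustration under assumptions: the payoff values T=5, R=3, P=1, S=0 are the conventional Axelrod tournament values and are not stated in the abstract, and the names `PAYOFFS`, `tit_for_tat`, `always_defect`, and `play_match` are hypothetical.

```python
# Payoffs as (row player, column player). Assumed conventional values:
# temptation T=5, reward R=3, punishment P=1, sucker's payoff S=0.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]


def always_defect(own_history, opp_history):
    """Defect on every round regardless of history."""
    return "D"


def play_match(strategy_a, strategy_b, rounds):
    """Play a fixed number of rounds and return both cumulative scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy sees its own history first, then the opponent's.
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play_match(tit_for_tat, always_defect, 5))
    print(play_match(tit_for_tat, tit_for_tat, 5))
```

A learning agent such as the one the abstract describes would take the place of one of these fixed strategies and choose its move from the observed history.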


Pages: 2523-2534

Number of pages: 12


