Modeling behavioral experiments on uncertainty and cooperation with population-based reinforcement learning

Cited by: 9
Authors
Domingos, Elias Fernandez [1, 2, 3]
Grujic, Jelena [1, 2]
Burguillo, Juan C. [3]
Santos, Francisco C. [2, 4, 5]
Lenaerts, Tom [1, 2]
Affiliations
[1] Vrije Univ Brussel, Comp Sci Dept, Artificial Intelligence Lab, B-1050 Brussels, Belgium
[2] Univ Libre Bruxelles, Dept Informat, Machine Learning Grp, B-1050 Brussels, Belgium
[3] Univ Vigo, atlanTTic Res Ctr, Vigo 36310, Spain
[4] Univ Lisbon, INESC ID, P-2744016 Porto Salvo, Portugal
[5] Univ Lisbon, Inst Super Tecn, P-2744016 Porto Salvo, Portugal
Keywords
Public goods game; Population dynamics; Individual learning; Collective risk; Uncertainty;
DOI
10.1016/j.simpat.2021.102299
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
From climate action to public health measures, human collective endeavors are often shaped by different uncertainties. Here we introduce a novel population-based learning model wherein a group of individuals facing a collective risk dilemma acquire their strategies over time through reinforcement learning, while handling different sources of uncertainty. In such an N-person collective risk dilemma, players make step-wise contributions to avoid a catastrophe that would result in a loss of wealth for all players. Success is attained if they collectively reach a certain contribution level over time or if, when the threshold is not reached, they are lucky enough to avoid the cataclysm. The dilemma lies in the trade-off between the proportion of their wealth that players wish to contribute to collectively reach the goal and the remainder they can keep at the end of the game. We show that the strategies learned with the model correspond to those experimentally observed, even when there is uncertainty about the risk of failing when the goal is not reached, the magnitude of the threshold to attain, or the time available to reach the target. We furthermore confirm that being unsure about the time window favors more extreme reactions and polarization, diminishing the number of agents that contribute fairly. The population-based online learning framework we propose is general enough to be applicable to a wide range of collective action problems and arbitrarily large sets of available policies.
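The payoff rule of the collective risk dilemma described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the function name `crd_payoffs`, its parameters, and the assumption that a catastrophe wipes out all remaining wealth are hypothetical choices made only for illustration, and the reinforcement learning component is not shown.

```python
import random


def crd_payoffs(contributions, endowment, threshold, risk, rng=random):
    """Payoffs for one collective risk dilemma game (illustrative only).

    contributions: one list per player with that player's contribution
                   in each round (step-wise contributions).
    endowment:     initial wealth of every player (assumed equal).
    threshold:     total group contribution needed to avoid the risk.
    risk:          probability of catastrophe when the threshold is missed.
    """
    totals = [sum(per_round) for per_round in contributions]
    if any(t > endowment for t in totals):
        raise ValueError("a player contributed more than the endowment")

    # Wealth each player keeps if no catastrophe occurs.
    remaining = [endowment - t for t in totals]

    # Success: the group reached the collective target.
    if sum(totals) >= threshold:
        return remaining

    # Target missed: with probability `risk` everyone loses.
    # Assumption: the loss is total (all remaining wealth is lost).
    if rng.random() < risk:
        return [0 for _ in remaining]

    # Target missed, but the group was lucky.
    return remaining


# Two players, two rounds, endowment of 4 each.
plays = [[2, 2], [0, 0]]
reached = crd_payoffs(plays, endowment=4, threshold=4, risk=0.9)
missed_certain_loss = crd_payoffs(plays, endowment=4, threshold=6, risk=1.0)
missed_no_risk = crd_payoffs(plays, endowment=4, threshold=6, risk=0.0)
```

In this sketch the free-riding second player keeps the full endowment whenever the group succeeds or gets lucky, which is the trade-off the abstract refers to.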
Pages: 16