Inferring strategies from observations in long iterated Prisoner's dilemma experiments

Cited by: 7

Authors:

Montero-Porras, Eladio [1]

Grujic, Jelena [1]

Domingos, Elias Fernandez [1,2]

Lenaerts, Tom [1,2,3,4]

Affiliations:

[1] Vrije Univ Brussel, Artificial Intelligence Lab, B-1050 Brussels, Belgium

[2] Univ Libre Bruxelles, Machine Learning Grp, B-1050 Brussels, Belgium

[3] Univ Calif Berkeley, Ctr Human Compatible AI, Berkeley, CA 94702 USA

[4] Vrije Univ Brussel, FARI Inst, Univ Libre Bruxelles, B-1050 Brussels, Belgium

Source:

SCIENTIFIC REPORTS | 2022 / Volume 12 / Issue 01

Keywords:

TIT-FOR-TAT; EVOLUTION; COOPERATION; BEHAVIOR;

DOI:

10.1038/s41598-022-11654-2

Chinese Library Classification (CLC):

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

While many theoretical studies have revealed the strategies that could lead to and maintain cooperation in the Iterated Prisoner's Dilemma, less is known about what human participants actually do in this game and how strategies change when participants are confronted with anonymous partners in each round. Previous attempts used short experiments, made different assumptions about possible strategies, and led to very different conclusions. We present here two long treatments that differ in the partner matching strategy used, i.e. fixed or shuffled partners. We use unsupervised methods to cluster the players based on their actions and then a Hidden Markov Model to infer the memory-one strategies in each cluster. Analysis of the inferred strategies reveals that fixed partner interaction leads to behavioral self-organization. Shuffled partners generate subgroups of memory-one strategies that remain entangled, apparently blocking the self-selection process that leads to fully cooperating participants in the fixed partner treatment. Analyzing the latter in more detail shows that AllC, AllD, TFT- and WSLS-like behavior can be observed. This study also reveals that long treatments are needed, as experiments with fewer than 25 rounds capture mostly the learning phase participants go through in these kinds of experiments.
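To make the notion of a memory-one strategy concrete, the sketch below estimates a player's probability of cooperating given the previous round's outcome and compares it with the four strategies named in the abstract. It is a simplified frequency count, not the clustering plus Hidden Markov Model pipeline the authors describe; the function names, the `"C"`/`"D"` string encoding, and the distance measure are illustrative assumptions.

```python
from collections import defaultdict

# Canonical memory-one strategies as P(cooperate | own last move, partner's last move).
# State keys are written (own, partner): "CD" means "I cooperated, partner defected".
CANONICAL = {
    "AllC": {"CC": 1.0, "CD": 1.0, "DC": 1.0, "DD": 1.0},
    "AllD": {"CC": 0.0, "CD": 0.0, "DC": 0.0, "DD": 0.0},
    "TFT":  {"CC": 1.0, "CD": 0.0, "DC": 1.0, "DD": 0.0},
    "WSLS": {"CC": 1.0, "CD": 0.0, "DC": 0.0, "DD": 1.0},
}

def estimate_memory_one(own, partner):
    """Estimate P(C | previous own move, previous partner move) by counting.

    `own` and `partner` are equal-length strings of "C"/"D" (hypothetical encoding).
    Only states that actually occur in the history appear in the result.
    """
    coop = defaultdict(int)
    total = defaultdict(int)
    for t in range(1, len(own)):
        state = own[t - 1] + partner[t - 1]
        total[state] += 1
        coop[state] += own[t] == "C"
    return {s: coop[s] / total[s] for s in total}

def closest_canonical(estimate):
    """Return the canonical strategy with the smallest mean absolute gap on observed states."""
    if not estimate:
        raise ValueError("need at least two rounds to estimate a memory-one strategy")
    def gap(name):
        return sum(abs(CANONICAL[name][s] - p) for s, p in estimate.items()) / len(estimate)
    return min(CANONICAL, key=gap)

# A player who opens with C and then copies the partner's previous move.
estimate = estimate_memory_one("CCDDCC", "CDDCCD")
print(closest_canonical(estimate))
```

With short histories many states are observed only once or not at all, so such estimates are noisy; this is consistent with the abstract's point that long treatments are needed.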


Pages: 12

References: 58 in total

