A multi-armed bandit algorithm speeds up the evolution of cooperation

Cited: 0
Authors
Gatti, Roberto Cazzolla [1,2]
Affiliations
[1] Konrad Lorenz Inst Evolut & Cognit Res, Klosterneuburg, Austria
[2] Tomsk State Univ, Biol Inst, Tomsk, Russia
Keywords
Evolution of cooperation; Multi-armed bandit algorithm; Epsilon-greedy model; Matryoshka model; TIT-FOR-TAT; KIN SELECTION; RECIPROCAL ALTRUISM; OPTIMIZATION; RECOGNITION; COMPETITION; BIODIVERSITY; NETWORKS; DEFENSE; RED;
DOI
10.1016/j.ecolmodel.2020.109348
Chinese Library Classification (CLC)
Q14 [Ecology (Bioecology)]
Discipline Codes
071012; 0713
Abstract
Most evolutionary biologists consider selfishness an intrinsic feature of our genes and the best choice in social situations. In recent years, prolific research has been conducted on the mechanisms that allow cooperation to emerge "in a world of defectors" and become an evolutionarily stable strategy. A major debate began with W.D. Hamilton's proposal of "kin selection", framed in terms of the cost sustained by cooperators and the benefit received by related conspecifics. Since then, four other main rules for the evolution of cooperation have been suggested. A central problem shared by these five rules, however, is the assumption that the payoffs of cooperating or defecting are well known to the parties before they interact and do not change over time or across repeated encounters. This is not always the case in real life. By following each rule blindly, individuals risk getting stuck in an unfavorable situation. Axelrod (1984) highlighted that the main problem is how to obtain the benefits of cooperation without passing through many trials and errors, which are slow and painful. With a better understanding of this process, individuals can use foresight to speed up the evolution of cooperation. Here I show that a multi-armed bandit (MAB) model, a classic problem in decision science, is naturally employed by individuals to opt for the best choice most of the time, accelerating the evolution of altruistic behavior and solving the problems above. A common MAB model that applies extremely well to the evolution of cooperation is the epsilon-greedy (ε-greedy) algorithm. After an initial period of exploration (which can be regarded as biological history), this algorithm greedily exploits the best-known option with probability 1 − ε and explores the other options with the remaining probability ε. Through the ε-greedy decision-making algorithm, cooperation evolves as a multilevel process nested in the hierarchical levels that exist among the five rules for the evolution of cooperation. Reinforcement learning of this kind, a subfield of artificial intelligence built on trial and error, provides a powerful tool to better understand, and even probabilistically quantify, the chances that cooperation has to evolve in a specific situation.
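As a concrete illustration of the decision rule described in the abstract, below is a minimal sketch of an ε-greedy multi-armed bandit in Python. It is an illustration under assumed settings, not the author's model from the paper: the three candidate strategies, the Gaussian payoff noise, and the value epsilon = 0.1 are hypothetical choices for demonstration.

import random

def epsilon_greedy_bandit(payoff_means, epsilon=0.1, rounds=10000):
    # Estimated mean payoff and play count for each strategy ("arm").
    n_arms = len(payoff_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_payoff = 0.0
    for _ in range(rounds):
        if random.random() < epsilon:
            # Explore: try a random strategy (probability epsilon).
            arm = random.randrange(n_arms)
        else:
            # Exploit: play the strategy with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # Draw a noisy payoff around the arm's (unknown) true mean.
        payoff = random.gauss(payoff_means[arm], 1.0)
        counts[arm] += 1
        # Incremental running-mean update of the payoff estimate.
        estimates[arm] += (payoff - estimates[arm]) / counts[arm]
        total_payoff += payoff
    return estimates, total_payoff

# Hypothetical example: three social strategies (e.g. always defect,
# tit-for-tat, unconditional cooperation) with made-up true payoffs.
estimates, total = epsilon_greedy_bandit([0.5, 1.0, 1.5])
print(estimates, total)

With a small epsilon, the agent converges on the highest-payoff strategy while still occasionally sampling the alternatives, which lets it adapt if payoffs change across repeated encounters, precisely the situation the abstract argues the five classical rules do not handle.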
Pages: 10
Related Papers
Showing 10 of 50 records
  • [1] Improving throughput using multi-armed bandit algorithm for wireless LANs
    Kuroda, Kaori
    Kato, Hiroki
    Kim, Song-Ju
    Naruse, Makoto
    Hasegawa, Mikio
    IEICE NONLINEAR THEORY AND ITS APPLICATIONS, 2018, 9 (01): 74-81
  • [2] Combinatorial Multi-Armed Bandit with General Reward Functions
    Chen, Wei
    Hu, Wei
    Li, Fu
    Li, Jian
    Liu, Yu
    Lu, Pinyan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [3] Interface Design Optimization as a Multi-Armed Bandit Problem
    Lomas, J. Derek
    Forlizzi, Jodi
    Poonwala, Nikhil
    Patel, Nirmal
    Shodhan, Sharan
    Patel, Kishan
    Koedinger, Ken
    Brunskill, Emma
    34TH ANNUAL CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2016, 2016: 4142-4153
  • [4] Multi-armed Bandit Algorithms for Adaptive Learning: A Survey
    Mui, John
    Lin, Fuhua
    Dewan, M. Ali Akber
    ARTIFICIAL INTELLIGENCE IN EDUCATION (AIED 2021), PT II, 2021, 12749: 273-278
  • [5] Optimal activation of halting multi-armed bandit models
    Cowan, Wesley
    Katehakis, Michael N.
    Ross, Sheldon M.
    NAVAL RESEARCH LOGISTICS, 2023, 70 (07): 639-652
  • [6] Efficient wireless network selection by using multi-armed bandit algorithm for mobile terminals
    Oshima, Koji
    Onishi, Takuma
    Kim, Song-Ju
    Ma, Jing
    Hasegawa, Mikio
    IEICE NONLINEAR THEORY AND ITS APPLICATIONS, 2020, 11 (01): 68-77
  • [7] Multi-armed bandit algorithm for sequential experiments of molecular properties with dynamic feature selection
    Abedin, Md. Menhazul
    Tabata, Koji
    Matsumura, Yoshihiro
    Komatsuzaki, Tamiki
    JOURNAL OF CHEMICAL PHYSICS, 2024, 161 (01)
  • [8] Multi-objective Contextual Multi-armed Bandit With a Dominant Objective
    Tekin, Cem
    Turgay, Eralp
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2018, 66 (14): 3799-3813
  • [9] Design of Multi-Armed Bandit-Based Routing for in-Network Caching
    Tabei, Gen
    Ito, Yusuke
    Kimura, Tomotaka
    Hirata, Kouji
    IEEE ACCESS, 2023, 11: 82584-82600
  • [10] Deterministic Sequencing of Exploration and Exploitation for Multi-Armed Bandit Problems
    Vakili, Sattar
    Liu, Keqin
    Zhao, Qing
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2013, 7 (05): 759-767