Policy-Based Reinforcement Learning for Assortative Matching in Human Behavior Modeling

Cited by: 0
Authors
Deng, Ou [1]
Jin, Qun [1]
Affiliations
[1] Waseda Univ, Grad Sch Human Sci, Tokorozawa, Japan
Source
DIGITAL HUMAN MODELING AND APPLICATIONS IN HEALTH, SAFETY, ERGONOMICS AND RISK MANAGEMENT, DHM 2023, PT II | 2023 / Vol. 14029
Keywords
Multiagent system; Reinforcement learning; Game theory; Human behavior modeling
DOI
10.1007/978-3-031-35748-0_28
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper explores human behavior in virtual networked communities, specifically the potential and expressive capacity of individuals or groups to respond to internal and external stimuli, with assortative matching as a typical example. A modeling approach based on Multi-Agent Reinforcement Learning (MARL) is proposed that adds a multi-head attention function to the A3C algorithm to enhance learning effectiveness. The approach simulates human behavior in certain scenarios through various environmental parameter settings and agent action strategies. In our experiment, reinforcement learning is employed to serve specific agents that learn from the environment status and competitor behaviors, optimizing their strategies to achieve better results. The simulation covers both the individual and group levels and displays possible paths to forming competitive advantages. This modeling approach provides a means for further analysis of the evolutionary dynamics of human behavior, communities, and organizations in various socioeconomic issues.
Pages: 378-391
Number of pages: 14
Related papers
13 in total
  • [1] Ariely D, 2010, UPSIDE IRRATIONALITY
  • [2] A comprehensive survey of multiagent reinforcement learning
    Busoniu, Lucian
    Babuska, Robert
    De Schutter, Bart
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2008, 38 (02): 156-172
  • [3] Ferguson T.S., 1989, Stat. Sci., V4, P282, DOI 10.1214/ss/1177012493
  • [4] Haarnoja T, 2018, PR MACH LEARN RES, V80
  • [5] Hoffman M., P 34 INT C MACH LEAR
  • [6] A game theory-reinforcement learning (GT-RL) method to develop optimal operation policies for multi-operator reservoir systems
    Madani, Kaveh
    Hooshyar, Milad
    [J]. JOURNAL OF HYDROLOGY, 2014, 519: 732-742
  • [7] Mnih V., 2013, PLAYING ATARI DEEP R, V1312, P5602, DOI 10.48550/ARXIV.1312.5602
  • [8] Mnih V, 2016, PR MACH LEARN RES, V48
  • [9] Human-level control through deep reinforcement learning
    Mnih, Volodymyr
    Kavukcuoglu, Koray
    Silver, David
    Rusu, Andrei A.
    Veness, Joel
    Bellemare, Marc G.
    Graves, Alex
    Riedmiller, Martin
    Fidjeland, Andreas K.
    Ostrovski, Georg
    Petersen, Stig
    Beattie, Charles
    Sadik, Amir
    Antonoglou, Ioannis
    King, Helen
    Kumaran, Dharshan
    Wierstra, Daan
    Legg, Shane
    Hassabis, Demis
    [J]. NATURE, 2015, 518 (7540): 529-533
  • [10] Nowé A, 2012, ADAPT LEARN OPTIM, V12, P441