Policy-Based Reinforcement Learning for Assortative Matching in Human Behavior Modeling

Times cited: 0

Authors:

Deng, Ou [1]

Jin, Qun [1]

Affiliations:

[1] Waseda Univ, Grad Sch Human Sci, Tokorozawa, Japan

Source:

DIGITAL HUMAN MODELING AND APPLICATIONS IN HEALTH, SAFETY, ERGONOMICS AND RISK MANAGEMENT, DHM 2023, PT II | 2023 / Vol. 14029

Keywords:

Multiagent system; Reinforcement learning; Game theory; Human behavior modeling

DOI:

10.1007/978-3-031-35748-0_28

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper explores human behavior in virtual networked communities, specifically the potential and expressive capacity of individuals or groups to respond to internal and external stimuli, with assortative matching as a typical example. It proposes a modeling approach based on Multi-Agent Reinforcement Learning (MARL) that adds a multi-head attention function to the A3C algorithm to enhance learning effectiveness. The approach simulates human behavior in certain scenarios through various environmental parameter settings and agent action strategies. In the experiment, specific agents use reinforcement learning to learn from the environment state and from competitors' behavior, optimizing their strategies to achieve better results. The simulation covers both the individual and the group level and shows possible paths to forming competitive advantages. This modeling approach provides a means for further analysis of the evolutionary dynamics of human behavior, communities, and organizations in various socioeconomic issues.

Pages: 378-391

Page count: 14
