A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning

Cited by: 0

Authors:

MA Ye [1]

CHANG Tianqing [1]

FAN Wenhui [2]

Affiliations:

[1] Academy of Army Armored Force

[2] Department of Automation, Tsinghua University

Source:

Journal of Systems Engineering and Electronics | 2021 / Vol. 32 / No. 03

Keywords:

multi-agent; reinforcement learning; evolutionary game; Q-learning

DOI:

Not available

Chinese Library Classification (CLC) number:

O225 [Game theory]; TP181 [Automated reasoning, machine learning]

Discipline classification codes:

070105; 1201; 081104; 0812; 0835; 1405

Abstract:

In an evolutionary game in which a group works on the same task, changes in the game rules, personal interests, crowd size, and external supervision have uncertain effects on individual decision-making and game results. Within the Markov decision framework, a single-task multi-decision evolutionary game model based on multi-agent reinforcement learning is proposed to explore the evolutionary rules of the game process. The model can improve the result of an evolutionary game and facilitate task completion. First, based on multi-agent theory and to solve the existing problems in the original model, a negative feedback tax penalty mechanism is proposed to guide the strategy selection of individuals in the group. In addition, to evaluate the evolutionary game results of the group, a method for calculating the group intelligence level is defined. Second, the Q-learning algorithm is used to improve the guiding effect of the negative feedback tax penalty mechanism. In the model, the selection strategy of the Q-learning algorithm is improved, and a bounded rationality evolutionary game strategy is proposed based on the rules of evolutionary games and the bounded rationality of individuals. Finally, simulation results show that, under different negative feedback factor values and different group sizes, the proposed model effectively guides individuals to choose cooperation strategies that benefit task completion and stability, thereby improving the group intelligence level.
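
The abstract names two ingredients, Q-learning and a negative feedback tax penalty, without giving their formulas. The Python sketch below is only an illustration of how such pieces might fit together, not the authors' model: the tabular Q-learning update is the standard one, while the public-goods payoff, the penalty form (a tax on defectors proportional to the current share of defectors, scaled by a hypothetical factor k), and every function and parameter name are assumptions made for this example.

```python
import random

COOPERATE, DEFECT = 0, 1


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place; returns the new value."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the highest-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    values = q[state]
    return values.index(max(values))


def taxed_reward(base_payoff, action, defect_fraction, k):
    """Hypothetical penalty (an assumption): defectors pay k times the defector share."""
    penalty = k * defect_fraction if action == DEFECT else 0.0
    return base_payoff - penalty


def simulate(n_agents=20, rounds=500, k=3.0, r=1.6, epsilon=0.1, seed=0):
    """Run a toy public-goods game; return the cooperator share in the final round."""
    rng = random.Random(seed)
    # Each agent keeps its own table with a single state and two actions.
    q_tables = [[[0.0, 0.0]] for _ in range(n_agents)]
    actions = []
    for _ in range(rounds):
        actions = [epsilon_greedy(q, 0, epsilon, rng) for q in q_tables]
        n_coop = actions.count(COOPERATE)
        defect_fraction = 1 - n_coop / n_agents
        share = r * n_coop / n_agents  # equal split of the multiplied contributions
        for q, a in zip(q_tables, actions):
            base = share - (1.0 if a == COOPERATE else 0.0)  # cooperators pay cost 1
            q_update(q, 0, a, taxed_reward(base, a, defect_fraction, k), 0)
    return actions.count(COOPERATE) / n_agents


if __name__ == "__main__":
    for k in (0.0, 3.0):
        print("k =", k, "final cooperator share =", simulate(k=k))
```

Because the penalty grows with the share of defectors, it acts as a negative feedback term in this toy setting: the more agents defect, the less attractive defection becomes. The outcome depends on the seed and on the assumed payoff values, so the sketch makes no claim about the results reported in the paper.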

Pages: 642-657

Page count: 16
