Game Theory-Based Control System Algorithms with Real-Time Reinforcement Learning: How to Solve Multiplayer Games Online

Cited by: 136

Authors:

Vamvoudakis, Kyriakos G. ^{[1]}

Modares, Hamidreza ^{[2]}

Kiumarsi, Bahare ^{[3]}

Lewis, Frank L. ^{[4,5]}

Affiliations:

[1] Virginia Tech, Dept Aerosp & Ocean Engn, Blacksburg, VA 24061 USA

[2] Missouri Univ Sci & Technol, Rolla, MO USA

[3] Univ Texas Arlington, Arlington, TX 76019 USA

[4] Univ Texas Arlington, Res Inst, Ft Worth, TX USA

[5] Northeastern Univ, Shenyang, Peoples R China

Source:

IEEE CONTROL SYSTEMS MAGAZINE | 2017 / Volume 37 / Issue 01

Keywords:

OPTIMAL TRACKING CONTROL; ZERO-SUM GAMES; STACKELBERG STRATEGY; FEEDBACK; EQUATION

DOI:

10.1109/MCS.2016.2621461

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Complex human-engineered systems involve an interconnection of multiple decision makers (or agents) whose collective behavior depends on a compilation of local decisions that are based on partial information about each other and the state of the environment [1]-[4]. Strategic interactions among agents in these systems can be modeled as a multiplayer simultaneous-move game [5]-[8]. The agents involved can have conflicting objectives, and it is natural to make decisions based upon optimizing individual payoffs or costs. © 2016 IEEE.

References

Pages: 33-52

Page count: 20

66 entries in total

[51] Multi-agent team cooperation: A game theory approach [J]. Semsar-Kazerooni, E.; Khorasani, K. AUTOMATICA, 2009, 45(10): 2205-2213

[52] On the Stackelberg Strategy in Nonzero-Sum Games [J]. Simaan, M.; Cruz, J. B., Jr. JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 1973, 11(05): 533-555

[53] Additional Aspects of the Stackelberg Strategy in Nonzero-Sum Games [J]. Simaan, M.; Cruz, J. B., Jr. JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 1973, 11(06): 613-626

[54] Sutton RS, 2018, ADAPT COMPUT MACH LE, P1

[55] Torrance G W, 1989, Int J Technol Assess Health Care, V5, P559

[56] Vamvoudakis KG, 2012, IEEE DECIS CONTR P, P1883, DOI 10.1109/CDC.2012.6426969

[57] Vamvoudakis K.G., 2011, Journal of Artificial Intelligence and Soft Computing Research, V1

[58] Non-zero sum Nash Q-learning for unknown deterministic continuous-time linear systems [J]. Vamvoudakis, Kyriakos G. AUTOMATICA, 2015, 61: 274-281

[59] Vamvoudakis KG, 2015, P AMER CONTR CONF, P5062, DOI 10.1109/ACC.2015.7172127

[60] Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality [J]. Vamvoudakis, Kyriakos G.; Lewis, Frank L.; Hudas, Greg R. AUTOMATICA, 2012, 48(08): 1598-1611
