Mirror Descent Learning in Continuous Games

Cited by: 0
Authors
Zhou, Zhengyuan [1, 2]
Mertikopoulos, Panayotis [3]
Moustakas, Aris L. [4, 5]
Bambos, Nicholas [1, 2]
Glynn, Peter [1, 2]
Affiliations
[1] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Management Sci & Engn, Stanford, CA 94305 USA
[3] Univ Grenoble Alpes, CNRS, Grenoble INP, INRIA, LIG, F-38000 Grenoble, France
[4] Univ Athens, Dept Phys, Athens, Greece
[5] IASA, Athens, Greece
Source
2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC) | 2017
Keywords
OPTIMIZATION;
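DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code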
0812 ;
Abstract
Online Mirror Descent (OMD) is an important and widely used class of adaptive learning algorithms that enjoys good regret performance guarantees. It is therefore natural to study the evolution of the joint action in a multi-agent decision process (typically modeled as a repeated game) where every agent employs an OMD algorithm. This well-motivated question has received much attention in the literature that lies at the intersection between learning and games. However, much of the existing literature has been focused on the time average of the joint iterates. In this paper, we tackle a harder problem that is of practical utility, particularly in the online decision making setting: the convergence of the last iterate when all the agents make decisions according to OMD. We introduce an equilibrium stability notion called variational stability (VS) and show that in variationally stable games, the last iterate of OMD converges to the set of Nash equilibria. We also extend the OMD learning dynamics to a more general setting where the exact gradient is not available and show that the last iterate (now random) of OMD converges to the set of Nash equilibria almost surely.
Pages: 8