Nash Equilibrium Seeking in Noncooperative Games

被引：278

作者：

Frihauf, Paul ^{[1
]}

Krstic, Miroslav ^{[1
]}

Basar, Tamer ^{[2
]}

机构：

[1] Univ Calif San Diego, Dept Mech & Aerosp Engn, La Jolla, CA 92093 USA

[2] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA

来源：

IEEE TRANSACTIONS ON AUTOMATIC CONTROL | 2012年 / 57卷 / 05期

基金：

美国国家科学基金会;

关键词：

Extremum seeking; learning; Nash equilibria; noncooperative games; EXTREMUM-SEEKING; RESPONSE DYNAMICS; STABILITY; NETWORKS; DESIGN; FLOW;

D O I：

10.1109/TAC.2011.2173412

中图分类号：

TP [自动化技术、计算机技术];

学科分类号：

0812 ;

摘要：

We introduce a non-model based approach for locally stable convergence to Nash equilibria in static, noncooperative games with N players. In classical game theory algorithms, each player employs the knowledge of the functional form of his payoff and the knowledge of the other players' actions, whereas in the proposed algorithm, the players need to measure only their own payoff values. This strategy is based on the extremum seeking approach, which has previously been developed for standard optimization problems and employs sinusoidal perturbations to estimate the gradient. We consider static games with quadratic payoff functions before generalizing our results to games with non-quadratic payoff functions that are the output of a dynamic system. Specifically, we consider general nonlinear differential equations with N inputs and N outputs, where in the steady state, the output signals represent the payoff functions of a noncooperative game played by the steady-state values of the input signals. We employ the standard local averaging theory and obtain local convergence results for both quadratic payoffs, where the actual convergence is semi-global, and non-quadratic payoffs, where the potential existence of multiple Nash equilibria precludes semi-global convergence. Our convergence conditions coincide with conditions that arise in model-based Nash equilibrium seeking. However, in our framework the user is not meant to check these conditions because the payoff functions are presumed to be unknown. For non-quadratic payoffs, convergence to a Nash equilibrium is not perfect, but is biased in proportion to the perturbation amplitudes and the higher derivatives of the payoff functions. We quantify the size of these residual biases and confirm their existence numerically in an example noncooperative game. In this example, we present the first application of extremum seeking with projection to ensure that the players' actions remain in a given closed and bounded action set.

引用

页码：1192 / 1207

页数：16

共 51 条

[21]

Hofbauer J, 2006, DISCRETE CONT DYN-B, V6, P215

[22]

Hopkins E, 1999, GAME ECON BEHAV, V29, P138, DOI 10.1006/game.1999.0636

[23]

Horn R.A., 2012, Matrix Analysis

[24]

Jafari Amir, 2001, ICML, V1, P226

[25]

Khalil H., 2002, Control of Nonlinear Systems

[26] HCCI Engine Combustion-Timing Control: Optimizing Gains and Fuel Consumption Via Extremum Seeking [J].

Killingsworth, Nick J. ;

Aceves, Salvador M. ;

Flowers, Daniel L. ;

Espinosa-Loza, Francisco ;

Krstic, Miroslav .

IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2009, 17 (06) :1350-1361

[27]

Krstic M., 2010, IFAC P, V43, P1086

[28] DISTRIBUTED ALGORITHMS FOR THE COMPUTATION OF NONCOOPERATIVE EQUILIBRIA [J].

LI, S ;

BASAR, T .

AUTOMATICA, 1987, 23 (04) :523-533

[29] COMPLETE STABILITY OF NONCOOPERATIVE GAMES [J].

LUENBERGER, DG .

JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 1978, 25 (04) :485-505

[30] Mixing Enhancement in 2D Magnetohydrodynamic Channel Flow by Extremum Seeking Boundary Control [J].

Luo, Lixiang ;

Schuster, Eugenio .

2009 AMERICAN CONTROL CONFERENCE, VOLS 1-9, 2009, :1530-1535

← 1 2 3 4 5 6 →