Hierarchical optimal control for input-affine nonlinear systems through the formulation of Stackelberg game

Cited by: 30

Authors:

Mu, Chaoxu ^{[1]}

Wang, Ke ^{[1]}

Zhang, Qichao ^{[2,3]}

Zhao, Dongbin ^{[2,3]}

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China

[2] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

[3] Univ Chinese Acad Sci, Beijing 100049, Peoples R China

Source:

INFORMATION SCIENCES | 2020 / Vol. 517

Funding:

National Natural Science Foundation of China;

Keywords:

Nonzero-sum differential game; Hierarchical optimization; Nonlinear dynamics; Stackelberg equilibrium; Neural network; STRATEGIES; ALGORITHM;

DOI:

10.1016/j.ins.2019.12.078

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Substantial efforts have been undertaken to explore nonzero-sum differential games. Most of these studies are devoted to devising algorithms to pursue Nash equilibrium, where all players with the same access to information will take policies synchronously. However, when it comes to hierarchical optimization and asymmetric information, Nash equilibrium is ineffective. The Stackelberg game provides us with an idea of leader-follower strategy to cope with this conundrum. The paper investigates hierarchical optimal control for continuous-time two-player input-affine systems characterized by nonlinear dynamics and quadratic cost functions. By introducing new costates, this optimization problem is formulated as a Stackelberg game in conjunction with a parametric optimization problem. Besides, the closed-loop information is available for both players. An adaptive learning algorithm is thus developed to approximately obtain the open-loop Stackelberg equilibrium while ensuring the uniform ultimate bounded stability of this closed-loop system, and two approximators structured by neural networks put this purpose into practice. Finally, two numerical examples illustrate that the proposed methodology can accurately obtain optimal solutions, and a comparative example illustrates its characteristics. (C) 2020 Elsevier Inc. All rights reserved.


Pages: 1-17

Page count: 17
