Reinforcement learning and non-zero-sum game output regulation for multi-player linear uncertain systems

Cited by: 63
Authors
Odekunle, Adedapo [1]
Gao, Weinan [1]
Davari, Masoud [1]
Jiang, Zhong-Ping [2]
Affiliations
[1] Georgia Southern Univ, Allen E Paulson Coll Engn & Comp, Dept Elect & Comp Engn, 1100 IT Dr, Statesboro, GA 30460 USA
[2] NYU, Tandon Sch Engn, Dept Elect & Comp Engn, Metrotech Ctr 6, Brooklyn, NY 11201 USA
Funding
U.S. National Science Foundation
Keywords
Reinforcement learning (RL); Adaptive optimal control; Game theory; Output regulation; Data-driven control; ADAPTIVE OPTIMAL-CONTROL; MULTIAGENT SYSTEMS; NONLINEAR-SYSTEMS
DOI
10.1016/j.automatica.2019.108672
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper studies the non-zero-sum game output regulation problem (CORP) for a class of continuous-time multi-player linear systems. Without the knowledge of state and input matrices, the Nash equilibrium solution, N-tuple of feedback control policy, is learned through online data collected along the system trajectories. A key strategy is, for the first time, to combine techniques from reinforcement learning (RL), differential game theory, and output regulation for data-driven control design. Different from the existing literature of adaptive optimal output regulation, the feedforward matrices are considered nontrivial. Theoretical analysis shows the disturbance rejection and tracking ability of the closed-loop system. Simulation results demonstrate the efficacy of the developed data-driven control approach. (C) 2019 Elsevier Ltd. All rights reserved.
Pages: 8
Related papers
33 in total
[1]   Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control [J].
Al-Tamimi, Asma ;
Lewis, Frank L. ;
Abu-Khalaf, Murad .
AUTOMATICA, 2007, 43 (03) :473-481
[2]  
Basar T., 1999, DYNAMIC NONCOOPERATI, V23
[3]   Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design [J].
Bian, Tao ;
Jiang, Zhong-Ping .
AUTOMATICA, 2016, 71 :348-360
[4]   Output regulation of nonlinear systems by sliding mode [J].
Bonivento, C ;
Marconi, L ;
Zanasi, R .
AUTOMATICA, 2001, 37 (04) :535-542
[5]  
Boyd S., 2004, CONVEX OPTIMIZATION
[6]   Adaptive Actor-Critic Design-Based Integral Sliding-Mode Control for Partially Unknown Nonlinear Systems With Input Disturbances [J].
Fan, Quan-Yong ;
Yang, Guang-Hong .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2016, 27 (01) :165-177
[7]   Leader-to-Formation Stability of Multiagent Systems: An Adaptive Optimal Control Approach [J].
Gao, Weinan ;
Jiang, Zhong-Ping ;
Lewis, Frank L. ;
Wang, Yebin .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2018, 63 (10) :3581-3587
[8]   Learning-Based Adaptive Optimal Tracking Control of Strict-Feedback Nonlinear Systems [J].
Gao, Weinan ;
Jiang, Zhong-Ping .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (06) :2614-2624
[9]   Adaptive Dynamic Programming and Adaptive Optimal Output Regulation of Linear Systems [J].
Gao, Weinan ;
Jiang, Zhong-Ping .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (12) :4164-4169
[10]  
Huang J., 2004, Nonlinear Output Regulation: Theory and Applications