Data-driven approximate optimal tracking control schemes for unknown non-affine non-linear multi-player systems via adaptive dynamic programming

Cited by: 12
Authors
Jiang, He [1]
Luo, Yanhong [1]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
optimal control; nonlinear systems; game theory; dynamic programming; learning (artificial intelligence); neural nets; least squares approximations; data-driven approximate optimal tracking control; nonaffine nonlinear multiplayer systems; adaptive dynamic programming; optimal control theory; ADP; nonzero-sum games; Hamilton-Jacobi equations; affine system; data-driven Q-learning approach; neural networks; NN weights; least-square form; ZERO-SUM GAMES;
DOI
10.1049/el.2016.4756
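Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract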
Game theory, optimal control theory and adaptive dynamic programming (ADP) are integrated to address the optimal tracking control problem for unknown non-affine multi-player systems. Non-zero-sum games of non-linear multi-player systems rely on solving a set of Hamilton-Jacobi equations, which are generally difficult to solve analytically because of their non-linear nature. Traditional ADP methods require an accurate system model and consider only the simpler affine case. A novel data-driven Q-learning approach is proposed that needs only the measurable data generated by the running system. To implement this method, neural networks (NNs) are utilised, and the NN weights are updated in least-squares form.
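The least-squares weight update mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a toy scalar, linear, single-player plant used only to generate data, a hypothetical quadratic basis `features`, and a fixed linear policy with a made-up `gain`. The learner itself sees only measured tuples (state, input, cost, next state), in the spirit of a data-driven Q-learning policy-evaluation step.

```python
import numpy as np


def features(x, u):
    # Hypothetical quadratic basis for a scalar state x and scalar input u.
    return np.array([x * x, x * u, u * u])


def fit_q_weights(transitions, policy):
    """Least-squares estimate of w in Q(x, u) ~= w @ features(x, u).

    Each transition is a measured tuple (x, u, cost, x_next); no model of
    the plant is used here.
    """
    rows, targets = [], []
    for x, u, cost, x_next in transitions:
        # Bellman equation for a fixed policy:
        #   Q(x, u) - Q(x_next, policy(x_next)) = cost
        rows.append(features(x, u) - features(x_next, policy(x_next)))
        targets.append(cost)
    w, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return w


# Toy data generator (an assumption for illustration only): the plant
# x_next = a*x + b*u is hidden from fit_q_weights.
rng = np.random.default_rng(0)
a, b, q, rho, gain = 0.9, 0.5, 1.0, 1.0, 0.4


def policy(x):
    # Fixed linear policy being evaluated (hypothetical gain).
    return -gain * x


transitions = []
for _ in range(50):
    x = rng.uniform(-1.0, 1.0)
    u = rng.uniform(-1.0, 1.0)  # exploratory input, independent of the policy
    x_next = a * x + b * u
    transitions.append((x, u, q * x * x + rho * u * u, x_next))

w = fit_q_weights(transitions, policy)
print(w)
```

Because the toy data are noise-free and the basis matches the true quadratic Q-function, the fitted weights coincide with the closed-form values for this plant. In a non-affine, multi-player setting the basis would be replaced by NN activation functions and one such fit would be needed per player; those details are not given in this record.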
Pages: 465-467
Page count: 2