Optimal output tracking control of linear discrete-time systems with unknown dynamics by adaptive dynamic programming and output feedback

Cited: 9
Authors
Cai, Xuan [1 ]
Wang, Chaoli [2 ]
Liu, Shuxin [1 ]
Chen, Guochu [1 ]
Wang, Gang [3 ]
Affiliations
[1] Shanghai Dianji Univ, Sch Elect Engn, Shanghai 201306, Peoples R China
[2] Univ Shanghai Sci & Technol, Dept Control Sci & Engn, Shanghai, Peoples R China
[3] Univ Shanghai Sci & Technol, Inst Machine Intelligence, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adaptive dynamic programming; data-driven control; optimal tracking control; optimal output regulation; output feedback; discrete-time systems; value iteration;
DOI
10.1080/00207721.2022.2085343
Chinese Library Classification
TP [automation technology, computer technology];
Discipline code
0812;
Abstract
This paper addresses the output-feedback-based, model-free optimal output tracking control problem for discrete-time systems with completely unknown system models under mild assumptions. The only information available for use is the system output and the reference output. To overcome the challenge of unknown dynamics, the paper solves a model-free optimal output regulation problem in order to achieve optimal output tracking; solving the optimal output regulation problem is equivalent to solving a linear quadratic regulation (LQR) problem together with a constrained static optimisation problem. A state reconstruction is first given that represents the system state in terms of the input and output sequences. A data-driven value iteration (VI) algorithm is then proposed to iteratively approximate, on the basis of input and output data, the solution to the discrete-time algebraic Riccati equation (ARE) of the corresponding LQR problem. Next, based on the iterative solutions of the ARE, a model-free solution is provided for the corresponding constrained static optimisation problem. Finally, an alternative reference system, equivalent to the original one, is established to avoid requiring knowledge of the reference system dynamics and the reference state. A numerical example demonstrates the effectiveness of the proposed control scheme.
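To illustrate the VI step the abstract describes, the following is a minimal model-based sketch of value iteration converging to the discrete-time ARE solution of an LQR problem. The system matrices A, B and weights Q, R below are hypothetical examples chosen only for illustration; the paper's actual algorithm is data-driven and estimates the same recursion from input and output data without using A and B.

```python
import numpy as np

# Hypothetical system and weighting matrices (illustration only).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)           # state weighting
R = np.array([[1.0]])   # input weighting

def vi_dare(A, B, Q, R, tol=1e-10, max_iter=10000):
    """Value iteration on the discrete-time algebraic Riccati equation.

    Iterates P_{k+1} = Q + A'P_k A - A'P_k B (R + B'P_k B)^{-1} B'P_k A
    from P_0 = 0 until the update stalls, returning the value matrix P
    and the associated feedback gain K.
    """
    P = np.zeros_like(Q)  # P_0 = 0 is a standard VI initialisation
    K = np.zeros((B.shape[1], A.shape[0]))
    for _ in range(max_iter):
        # Gain implied by the current value matrix P_k.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # One Riccati recursion step.
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next, K
        P = P_next
    return P, K

P, K = vi_dare(A, B, Q, R)
```

At convergence P satisfies the ARE, so u = -Kx is the LQR-optimal feedback; the paper builds its model-free tracking controller on top of this fixed point.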
Pages: 3426-3448 (23 pages)
Related papers (47 in total)
[1]   A Modified Stationary Reference Frame-Based Predictive Current Control With Zero Steady-State Error for LCL Coupled Inverter-Based Distributed Generation Systems [J].
Ahmed, Khaled H. ;
Massoud, Ahmed M. ;
Finney, Stephen J. ;
Williams, Barry W. .
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2011, 58 (04) :1359-1370
[2]   Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control [J].
Al-Tamimi, Asma ;
Lewis, Frank L. ;
Abu-Khalaf, Murad .
AUTOMATICA, 2007, 43 (03) :473-481
[3]   Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof [J].
Al-Tamimi, Asma ;
Lewis, Frank .
2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2007: 38+
[4]  
[Anonymous], 2012, Calculus of Variations and Optimal Control Theory: A Concise Introduction
[5]   Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation [J].
Beard, RW ;
Saridis, GN ;
Wen, JT .
AUTOMATICA, 1997, 33 (12) :2159-2177
[6]  
Bellman R. E., 1961, Adaptive control processes: A guided tour, DOI DOI 10.1515/9781400874668
[7]  
Bertsekas D. P., 2017, Dynamic programming and optimal control, VI
[8]  
Bertsekas D. P., 1996, Neuro-Dynamic Programming
[9]   Value and Policy Iterations in Optimal Control and Adaptive Dynamic Programming [J].
Bertsekas, Dimitri P. .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, 28 (03) :500-509
[10]   Reinforcement Learning and Adaptive Optimal Control for Continuous-Time Nonlinear Systems: A Value Iteration Approach [J].
Bian, Tao ;
Jiang, Zhong-Ping .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (07) :2781-2790