Optimal tracking control of batch processes with time-invariant state delay: Adaptive Q-learning with two-dimensional state and control policy

Cited by: 5

Authors:

Shi, Huiyuan ^{[1,2]}

Lv, Mengdi ^{[1]}

Jiang, Xueying ^{[2]}

Su, Chengli ^{[1,3]}

Li, Ping ^{[4]}

Affiliations:

[1] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun 113001, Peoples R China

[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China

[3] Liaodong Univ, Sch Informat Engn, Dandong 118001, Peoples R China

[4] Univ Sci & Technol Liaoning, Sch Elect & Informat Engn, Anshan 114051, Peoples R China

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2024 / Vol. 132

Funding:

National Natural Science Foundation of China;

Keywords:

Time delay; Two-dimensional (2D); Q-learning; Optimal tracking control; Batch process; FAULT-TOLERANT CONTROL; H-INFINITY CONTROL; SYSTEMS; UNCERTAINTIES; DESIGN; INPUT;

DOI:

10.1016/j.engappai.2024.108006

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Conventional model-based control methods are limited for dynamic systems with unknown model parameters, and existing reinforcement learning methods do not take batch and time delay information into account. To address the optimal tracking control problem for batch processes with time-invariant state delay, a novel data-based adaptive Q-learning approach with a two-dimensional (2D) state and control policy is proposed. The extended delay state-space equation, value function, Q function and optimal performance index are first presented along the time and batch directions. By examining the relationship between the 2D value function and the 2D Q function, a delay-dependent 2D Bellman equation is designed that is independent of the process model, and it is solved to obtain the expression of the control law. Without requiring prior knowledge of the system, the optimal gain matrices of the control law are then learned from the current and historical states, output errors and time delay information in the time and batch directions. This accelerates convergence and reduces the error between the optimal control gain matrices and the learned gain matrices, thereby improving the tracking performance of the system. The unbiasedness and convergence of the adaptive Q-learning approach are rigorously proved. Finally, the effectiveness of the proposed algorithm is validated by simulation comparisons on an injection molding process, in terms of the convergence of the control gains and output tracking.
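
For orientation, the link between a Q function and a feedback gain that the abstract relies on can be illustrated in the simplest possible setting. The block below is a standard textbook sketch for a delay-free, time-only linear quadratic problem, not the delay-dependent 2D Bellman equation of the paper. The symbols $A$, $B$, $Q_c$, $R$, $P$, $H$, $K$ and $z_k$ are assumptions introduced only for this illustration.

```latex
% Assumed plant and stage cost (illustrative only: no delay, no batch index)
x_{k+1} = A x_k + B u_k, \qquad
r(x_k, u_k) = x_k^{\top} Q_c x_k + u_k^{\top} R u_k

% Quadratic value function and Q function, with z_k = [x_k^{\top}, u_k^{\top}]^{\top}
V(x_k) = x_k^{\top} P x_k, \qquad
Q(x_k, u_k) = r(x_k, u_k) + V(x_{k+1}) = z_k^{\top} H z_k

H = \begin{bmatrix} H_{xx} & H_{xu} \\ H_{ux} & H_{uu} \end{bmatrix}
  = \begin{bmatrix} Q_c + A^{\top} P A & A^{\top} P B \\
                    B^{\top} P A & R + B^{\top} P B \end{bmatrix}

% Bellman equation written in H only, with u_{k+1} = -K x_{k+1} in z_{k+1}
z_k^{\top} H z_k = x_k^{\top} Q_c x_k + u_k^{\top} R u_k + z_{k+1}^{\top} H z_{k+1}

% Minimizing Q over u_k yields the gain directly from the blocks of H
u_k = -K x_k, \qquad K = H_{uu}^{-1} H_{ux}
```

Because the last two relations involve only $H$ and the measured $x_k$, $u_k$, the gain can in principle be estimated from data without knowing $A$ or $B$, which is the sense in which such schemes are called model-free. According to the abstract, the paper's formulation additionally carries state delay and batch-direction information in the state and Q function.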

References

Pages: 15

36 entries in total

[1] KRONECKER PRODUCTS AND MATRIX CALCULUS IN SYSTEM THEORY [J].

BREWER, JW.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, 1978, 25 (09) :772-781

[2] A Novel Iterative Learning Approach for Tracking Control of High-Speed Trains Subject to Unknown Time-Varying Delay [J].

Chen, Yong; Huang, Deqing; Li, Yanan; Feng, Xiaoyun.

IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2022, 19 (01) :113-121

[3] Doostmohammadian M., 2022, IEEE Open J. Control. Syst., V1, P255

[4] Distributed delay-tolerant strategies for equality-constraint sum-preserving resource allocation [J].

Doostmohammadian, Mohammadreza; Aghasi, Alireza; Vrakopoulou, Maria; Rabiee, Hamid R.; Khan, Usman A.; Charalambous, Themistoklis.

SYSTEMS & CONTROL LETTERS, 2023, 182

[5] Adaptive Optimal Output Regulation of Time-Delay Systems via Measurement Feedback [J].

Gao, Weinan; Jiang, Zhong-Ping.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (03) :938-945

[6] Two-dimensional delay compensation based iterative learning control scheme for batch processes with both input and state delays [J].

Hao, Shoulin.

JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2019, 356 (15) :8118-8137

[7] Improved model-free H∞ control for batch processes via off-policy 2D game Q-learning [J].

Jiang, Xueying; Huang, Min; Kuang, Hanbin; Shi, Huiyuan; Wang, Xingwei; Lee, Loo Hay.

INTERNATIONAL JOURNAL OF CONTROL, 2023, 96 (10) :2447-2463

[8] TASAC: A twin-actor reinforcement learning framework with a stochastic policy with an application to batch process control [J].

Joshi, Tanuja; Kodamana, Hariprasad; Kandath, Harikumar; Kaisare, Niket.

CONTROL ENGINEERING PRACTICE, 2023, 134

[9] Delay-dependent robust H∞ control for uncertain systems with a state-delay [J].

Lee, YS; Moon, YS; Kwon, WH; Park, PG.

AUTOMATICA, 2004, 40 (01) :65-72

[10] Two-Dimensional Iterative Learning Robust Asynchronous Switching Predictive Control for Multiphase Batch Processes With Time-Varying Delays [J].

Li, Hui; Wang, Shiqi; Shi, Huiyuan; Su, Chengli; Li, Ping.

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (10) :6488-6502
