Off-policy two-dimensional reinforcement learning for optimal tracking control of batch processes with network-induced dropout and disturbances

Cited by: 2
Authors
Jiang, Xueying [1 ,2 ]
Huang, Min [1 ,2 ]
Shi, Huiyuan [2 ,3 ]
Wang, Xingwei [4 ]
Zhang, Yanfeng [4 ]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Qinhuangdao, Peoples R China
[2] Northeastern Univ, State Key Lab Synthet Automa Proc Ind, Qinhuangdao, Peoples R China
[3] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun, Peoples R China
[4] Northeastern Univ, Coll Comp Sci & Engn, Fushun, Peoples R China
Keywords
Two-dimensional (2D); Reinforcement learning; Optimal tracking control; Batch processes; Network-induced dropout; Injection velocity; TIME-SYSTEMS; GAMES;
DOI
10.1016/j.isatra.2023.11.011
CLC number
TP [Automation and computer technology];
Subject classification code
0812 ;
Abstract
In this paper, a new off-policy two-dimensional (2D) reinforcement learning approach is proposed to address the optimal tracking control (OTC) problem of batch processes subject to network-induced dropout and disturbances. A dropout 2D augmented Smith predictor is first devised to estimate the current extended state using past data along both the time and batch directions. The dropout 2D value function and Q-function are then defined, and their relationship is analyzed to establish optimality. On this basis, the dropout 2D Bellman equation is derived from the Q-function. To solve the dropout 2D OTC problem of batch processes, two algorithms are presented: an off-line 2D policy iteration algorithm and an off-policy 2D Q-learning algorithm. The latter uses only the input and the estimated state, without requiring knowledge of the underlying system dynamics. The unbiasedness of the solutions and the convergence of the algorithms are analyzed separately. The effectiveness of the proposed methods is finally validated on a simulated case of the filling process.
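The core off-policy idea described in the abstract, evaluating and improving a target policy from data generated by a different exploratory behavior policy, using only inputs and measured states, can be illustrated with a minimal one-dimensional Q-learning sketch. This is a generic sketch, not the paper's algorithm: it drops the batch dimension, dropout compensation, and disturbances, and all numerical values (system gains, cost weights, gain initialization) are illustrative assumptions.

```python
import numpy as np

# Hypothetical scalar process model (illustrative numbers, not from the paper)
A, B = 0.9, 0.5          # state and input gains
Qc, Rc = 1.0, 0.1        # stage-cost weights

rng = np.random.default_rng(0)

# Collect data once with an exploratory behavior policy (off-policy setting:
# the learned target policy never generates this data)
x, data = 1.0, []
for _ in range(200):
    u = -0.3 * x + rng.normal(scale=1.0)   # behavior policy + exploration noise
    x_next = A * x + B * u
    data.append((x, u, x_next))
    x = x_next if abs(x_next) < 10 else rng.normal()

def phi(x, u):
    """Quadratic basis so that Q(x,u) = Hxx*x^2 + 2*Hxu*x*u + Huu*u^2."""
    return np.array([x * x, 2 * x * u, u * u])

# Policy iteration on the Q-function over the fixed dataset
K = 0.0                                    # initial stabilizing gain (A is stable)
for _ in range(20):
    Phi, r = [], []
    for (xk, uk, xn) in data:
        un = -K * xn                       # target-policy action at the next state
        Phi.append(phi(xk, uk) - phi(xn, un))   # Bellman-equation regressor
        r.append(Qc * xk * xk + Rc * uk * uk)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(r), rcond=None)
    Hxx, Hxu, Huu = theta
    K = Hxu / Huu                          # greedy policy improvement

# Model-based check: LQR gain from the scalar Riccati recursion
P = Qc
for _ in range(500):
    P = Qc + A * P * A - (A * P * B) ** 2 / (Rc + B * P * B)
K_star = (B * P * A) / (Rc + B * P * B)
print(abs(K - K_star))                     # learned gain matches the Riccati gain
```

The model parameters `A` and `B` appear only in the data-generation loop; the learning loop itself touches nothing but the stored `(x, u, x_next)` tuples, which mirrors the abstract's claim of using only inputs and estimated states rather than the system model.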
Pages: 228 - 244
Page count: 17
Related papers
34 records in total
  • [1] Optimal tracking control of nonlinear batch processes with unknown dynamics using two-dimensional off-policy interleaved Q-learning algorithm
    Shi, Huiyuan
    Gao, Wei
    Jiang, Xueying
    Su, Chengli
    Li, Ping
    INTERNATIONAL JOURNAL OF CONTROL, 2024, 97 (10) : 2329 - 2341
  • [2] Optimal robust online tracking control for space manipulator in task space using off-policy reinforcement learning
    Zhuang, Hongji
    Zhou, Hang
    Shen, Qiang
    Wu, Shufan
    Razoumny, Vladimir Yu.
    Razoumny, Yury N.
    AEROSPACE SCIENCE AND TECHNOLOGY, 2024, 153
  • [3] Off-policy reinforcement learning algorithm for robust optimal control of uncertain nonlinear systems
    Amirparast, Ali
    Kamal Hosseini Sani, S.
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, 34 (08) : 5419 - 5437
  • [4] Off-policy integral reinforcement learning optimal tracking control for continuous-time chaotic systems
    Wei Qing-Lai
    Song Rui-Zhuo
    Sun Qiu-Ye
    Xiao Wen-Dong
    CHINESE PHYSICS B, 2015, 24 (09)
  • [5] Hyperparameter Tuning of an Off-Policy Reinforcement Learning Algorithm for H∞ Tracking Control
    Farahmandi, Alireza
    Reitz, Brian
    Debord, Mark
    Philbrick, Douglas
    Estabridis, Katia
    Hewer, Gary
    LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
  • [6] Optimal Control of Iron-Removal Systems Based on Off-Policy Reinforcement Learning
    Chen, Ning
    Luo, Shuhan
    Dai, Jiayang
    Luo, Biao
    Gui, Weihua
    IEEE ACCESS, 2020, 8 (08) : 149730 - 149740
  • [7] Optimal Control for Multi-agent Systems Using Off-Policy Reinforcement Learning
    Wang, Hao
    Chen, Zhiru
    Wang, Jun
    Lu, Lijun
    Li, Mingzhe
    2022 4TH INTERNATIONAL CONFERENCE ON CONTROL AND ROBOTICS, ICCR, 2022 : 135 - 140
  • [8] Optimal tracking control of batch processes with time-invariant state delay: Adaptive Q-learning with two-dimensional state and control policy
    Shi, Huiyuan
    Lv, Mengdi
    Jiang, Xueying
    Su, Chengli
    Li, Ping
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 132
  • [9] Two-Dimensional Model-Free Optimal Tracking Control for Batch Processes With Packet Loss
    Shi, Huiyuan
    Wen, Xin
    Jiang, Xueying
    Su, Chengli
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (02) : 1032 - 1045