Off-policy two-dimensional reinforcement learning for optimal tracking control of batch processes with network-induced dropout and disturbances

Cited by: 2

Authors:

Jiang, Xueying [1,2]

Huang, Min [1,2]

Shi, Huiyuan [2,3]

Wang, Xingwei [4]

Zhang, Yanfeng [4]

Affiliations:

[1] Northeastern Univ, Coll Informat Sci & Engn, Qinhuangdao, Peoples R China

[2] Northeastern Univ, State Key Lab Synthet Automa Proc Ind, Qinhuangdao, Peoples R China

[3] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun, Peoples R China

[4] Northeastern Univ, Coll Comp Sci & Engn, Fushun, Peoples R China

Source:

ISA TRANSACTIONS | 2024 / Vol. 144

Keywords:

Two-dimensional (2D); Reinforcement learning; Optimal tracking control; Batch processes; Network-induced dropout; Injection velocity; TIME-SYSTEMS; GAMES

DOI:

10.1016/j.isatra.2023.11.011

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In this paper, a new off-policy two-dimensional (2D) reinforcement learning approach is proposed for the optimal tracking control (OTC) of batch processes subject to network-induced dropout and disturbances. A dropout 2D augmented Smith predictor is first devised to estimate the current extended state from past data along the time and batch directions. The dropout 2D value function and Q-function are then defined, and the relation between them is analyzed with respect to optimal performance. On this basis, the dropout 2D Bellman equation is derived from the Q-function. To solve the dropout 2D OTC problem of batch processes, two algorithms are presented: an off-line 2D policy iteration algorithm and an off-policy 2D Q-learning algorithm. The latter uses only the input and the estimated state and requires no knowledge of the underlying system dynamics. The unbiasedness of the solutions and the convergence of the algorithms are analyzed separately. Finally, the effectiveness of the proposed methods is validated through a simulated case study of the filling process.
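As a rough illustration of the policy iteration idea mentioned in the abstract, the sketch below runs Q-function-based policy iteration on a standard one-dimensional (time-only) discrete-time linear quadratic problem with a known model. It is not the paper's dropout 2D algorithm: the matrices `A`, `B`, `Qc`, `R`, the initial gain `K0`, and all function names are hypothetical choices made only for this example, and the batch direction, dropout, disturbances, and Smith predictor are all omitted.

```python
import numpy as np

def evaluate_policy(A, B, Qc, R, K):
    """Policy evaluation: solve P = Qc + K'RK + (A-BK)' P (A-BK) for u = -Kx."""
    Ac = A - B @ K                      # closed-loop matrix under the gain K
    Qk = Qc + K.T @ R @ K               # stage cost under the gain K
    n = A.shape[0]
    # Vectorized Lyapunov equation: (I - Ac' (x) Ac') vec(P) = vec(Qk)
    M = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    P = np.linalg.solve(M, Qk.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)              # symmetrize against round-off

def q_matrix(A, B, Qc, R, P):
    """Kernel H of the quadratic Q-function Q(x, u) = [x; u]' H [x; u]."""
    return np.block([
        [Qc + A.T @ P @ A, A.T @ P @ B],
        [B.T @ P @ A,      R + B.T @ P @ B],
    ])

def policy_iteration(A, B, Qc, R, K0, iters=50, tol=1e-10):
    """Alternate policy evaluation and greedy improvement on the Q-function."""
    n = A.shape[0]
    K = K0                              # assumed to be a stabilizing gain
    for _ in range(iters):
        P = evaluate_policy(A, B, Qc, R, K)
        H = q_matrix(A, B, Qc, R, P)
        # Greedy improvement: minimize Q(x, u) over u, giving K = H_uu^{-1} H_ux
        K_new = np.linalg.solve(H[n:, n:], H[n:, :n])
        converged = np.max(np.abs(K_new - K)) < tol
        K = K_new
        if converged:
            break
    return K, P

# Hypothetical stable second-order system with a single input
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc = np.eye(2)
R = np.array([[1.0]])
K, P = policy_iteration(A, B, Qc, R, K0=np.zeros((1, 2)))
print("gain K:", K)
```

An off-policy Q-learning variant would estimate the kernel `H` from measured input and state data instead of computing it from `A` and `B`; that data-driven step is not shown here.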


Pages: 228-244

Page count: 17
