Off-policy two-dimensional reinforcement learning for optimal tracking control of batch processes with network-induced dropout and disturbances

Cited by: 2
Authors
Jiang, Xueying [1, 2]
Huang, Min [1, 2]
Shi, Huiyuan [2, 3]
Wang, Xingwei [4 ]
Zhang, Yanfeng [4 ]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Qinhuangdao, Peoples R China
[2] Northeastern Univ, State Key Lab Synthet Automa Proc Ind, Qinhuangdao, Peoples R China
[3] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun, Peoples R China
[4] Northeastern Univ, Coll Comp Sci & Engn, Fushun, Peoples R China
Keywords
Two-dimensional (2D); Reinforcement learning; Optimal tracking control; Batch processes; Network-induced dropout; Injection velocity; TIME-SYSTEMS; GAMES
DOI
10.1016/j.isatra.2023.11.011
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a new off-policy two-dimensional (2D) reinforcement learning approach to the optimal tracking control (OTC) problem for batch processes with network-induced dropout and disturbances. A dropout 2D augmented Smith predictor is first devised to estimate the current extended state from past data along the time and batch directions. The dropout 2D value function and Q-function are then defined, and their relation is analyzed to meet the optimal performance. On this basis, the dropout 2D Bellman equation is derived from the Q-function. To solve the dropout 2D OTC problem of batch processes, two algorithms are presented: an off-line 2D policy iteration algorithm and an off-policy 2D Q-learning algorithm. The latter uses only the input and the estimated state, without requiring the underlying information of the system. The unbiasedness of the solutions and the convergence are analyzed separately. The effectiveness of the proposed methods is validated on a simulated case of the filling process.
Pages: 228-244
Number of pages: 17