Novel data-driven two-dimensional Q-learning for optimal tracking control of batch process with unknown dynamics

Cited by: 18
Authors
Wen, Xin [1]
Shi, Huiyuan [1, 2, 3]
Su, Chengli [1, 4, 7]
Jiang, Xueying [5]
Li, Ping [1, 4]
Yu, Jingxian [6]
Affiliations
[1] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun, Peoples R China
[2] Northwestern Polytech Univ, Sch Automat, Xian, Peoples R China
[3] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang, Peoples R China
[4] Univ Sci & Technol Liaoning, Sch Elect & Informat Engn, Anshan, Peoples R China
[5] Northeastern Univ, Sch Informat Sci & Engn, Shenyang, Peoples R China
[6] Liaoning Petrochem Univ, Sch Sci, Fushun, Peoples R China
[7] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun 113001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Batch process; Data-driven; 2D off-policy Q-learning; Optimal tracking control; Injection molding; MODEL PREDICTIVE CONTROL; FAULT-TOLERANT CONTROL; STATE DELAY; DESIGN; FEEDBACK
DOI
10.1016/j.isatra.2021.06.007
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Existing control methods typically rely heavily on models of the batch process and are therefore difficult to apply to practical batch processes with unknown dynamics. To address this, a novel data-driven two-dimensional (2D) off-policy Q-learning approach for optimal tracking control (OTC) is proposed, yielding a model-free control law for the batch process. First, an extended state-space equation composed of the state and the output error is established to ensure the tracking performance of the designed controller. Second, based on this extended system, a behavior policy that generates data and a target policy that is optimized and learned are introduced. Then, a Bellman equation independent of the model parameters is derived by analyzing the relation between the 2D value function and the 2D Q-function. Policy iteration is carried out using only data measured along the batch and time directions of the batch process, which solves the optimal control problem without knowledge of the system dynamics. The unbiasedness and convergence of the designed 2D off-policy Q-learning algorithm are proved. Finally, a simulation case study of an injection molding process shows that control and tracking performance gradually improve as the number of batches increases. (c) 2021 ISA. Published by Elsevier Ltd. All rights reserved.
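The general idea of off-policy Q-learning with policy iteration can be illustrated on a much simpler problem than the one in the abstract. The sketch below is not the paper's 2D algorithm: it assumes a scalar, one-dimensional linear system x' = a*x + b*u with quadratic cost q*x^2 + r*u^2, and every name in it (`off_policy_q_learning`, `simulate_data`, `riccati_gain`, the parameter values) is hypothetical. The learner sees only measured tuples (x, u, x') produced by a random behavior policy; the plant parameters a and b appear only in the simulator and in the model-based reference gain used for comparison.

```python
import random


def solve_linear(M, v):
    """Solve M w = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - s) / aug[i][i]
    return w


def features(x, u):
    # Quadratic Q-function: Q(x, u) = h_xx*x^2 + 2*h_xu*x*u + h_uu*u^2
    return [x * x, 2.0 * x * u, u * u]


def off_policy_q_learning(data, q, r, k0=0.0, n_iter=15):
    """Policy iteration on Q using only (x, u, x_next) tuples; k0 must be stabilizing."""
    k = k0
    for _ in range(n_iter):
        # Policy evaluation: least squares on the Bellman equation
        # Q(x, u) - Q(x_next, -k*x_next) = q*x^2 + r*u^2
        G = [[0.0] * 3 for _ in range(3)]
        g = [0.0] * 3
        for x, u, xn in data:
            f, fn = features(x, u), features(xn, -k * xn)
            d = [f[i] - fn[i] for i in range(3)]
            cost = q * x * x + r * u * u
            for i in range(3):
                g[i] += d[i] * cost
                for j in range(3):
                    G[i][j] += d[i] * d[j]
        h_xx, h_xu, h_uu = solve_linear(G, g)
        # Policy improvement: u = -k*x minimizes the learned Q over u
        k = h_xu / h_uu
    return k


def simulate_data(a, b, n_samples=50, seed=0):
    """Behavior policy: random inputs applied at random states (assumed plant)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        x, u = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        data.append((x, u, a * x + b * u))
    return data


def riccati_gain(a, b, q, r, n_iter=500):
    """Model-based reference gain from the scalar Riccati recursion."""
    p = q
    for _ in range(n_iter):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


if __name__ == "__main__":
    data = simulate_data(a=0.9, b=0.5)
    print("learned gain:  ", off_policy_q_learning(data, q=1.0, r=1.0))
    print("reference gain:", riccati_gain(0.9, 0.5, 1.0, 1.0))
```

Because the data are collected once by the behavior policy and reused while the target gain `k` is updated, the same batch serves every iteration; this is the off-policy aspect. The initial gain `k0 = 0` is only valid here because the assumed plant is open-loop stable.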
Pages: 10-21
Number of pages: 12
Related papers
50 in total
  • [41] Model-Free Optimal Tracking Control via Critic-Only Q-Learning
    Luo, Biao
    Liu, Derong
    Huang, Tingwen
    Wang, Ding
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2016, 27 (10) : 2134 - 2144
  • [42] Data-Driven Optimal Control for Municipal Solid Waste Incineration Process
    Sun, Jian
    Meng, Xi
    Qiao, Junfei
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2023, 19 (12) : 11444 - 11454
  • [43] Novel data-driven optimal control methods for cost-effective brine treatment
    Kaddoura, Mustafa F.
    Wright, Natasha C.
    DESALINATION, 2024, 578
  • [44] Data-Driven Robust Approximate Optimal Tracking Control for Unknown General Nonlinear Systems Using Adaptive Dynamic Programming Method
    Zhang, Huaguang
    Cui, Lili
    Zhang, Xin
    Luo, Yanhong
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2011, 22 (12) : 2226 - 2236
  • [45] Nonlinear neuro-optimal tracking control via stable iterative Q-learning algorithm
    Wei, Qinglai
    Song, Ruizhuo
    Sun, Qiuye
    NEUROCOMPUTING, 2015, 168 : 520 - 528
  • [46] Data-driven model-free slip control of anti-lock braking systems using reinforcement Q-learning
    Radac, Mircea-Bogdan
    Precup, Radu-Emil
    NEUROCOMPUTING, 2018, 275 : 317 - 329
  • [47] Data-driven Neuro-optimal Tracking Control of Ozone Generation Process Based on Adaptive Dynamic Programming
    Dong, Zhe
    Liu, Wenjuan
    Li, Yueheng
    Han, Jie
    Chen, Mengjiao
    2017 6TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS (DDCLS), 2017 : 650 - 655
  • [48] Data-Driven Optimal Control for a Class of Unknown Continuous-Time Nonlinear System Using a Novel ADP Method
    Zhang, Kun
    Zhang, Huaguang
    Jiang, He
    Liu, Chong
    2016 SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP), 2016 : 117 - 124
  • [49] Dimensional Error Compensation Based on Data-Driven Sliding Mode Terminal Iterative Learning Control for CNC Batch Grinding
    Chen, Tiantian
    Tian, Xincheng
    APPLIED SCIENCES-BASEL, 2023, 13 (03)
  • [50] Adaptive data-driven design of fault-tolerant control systems with unknown dynamics
    Chen, Wenli
    Li, Xiaojian
    JOURNAL OF PROCESS CONTROL, 2025, 146