Reinforcement learning approach to autonomous PID tuning

Cited by: 84
Authors
Dogru, Oguzhan
Velswamy, Kirubakaran
Ibrahim, Fadi
Wu, Yuqi
Sundaramoorthy, Arun Senthil
Huang, Biao
Xu, Shu
Nixon, Mark
Bell, Noel
Affiliations
[1] Department of Chemical and Materials Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada
[2] Department of Electrical and Electronics Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada
[3] Emerson Electric Co., Austin, TX 78681, United States
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Contextual bandits; PID tuning; Process control; Reinforcement learning; Step-response model; Cascade control; Controllers; Improve; Single; System;
DOI
10.1016/j.compchemeng.2022.107760
CLC number
TP39 [Computer Applications];
Subject classification codes
081203; 0835;
Abstract
Many industrial processes utilize proportional-integral-derivative (PID) controllers due to their practicality and often satisfactory performance. The proper controller parameters depend highly on the operational conditions and process uncertainties. This study combines recent developments in computer science and control theory to address the tuning problem. It formulates PID tuning as a reinforcement learning task with constraints. The proposed scheme identifies an initial approximate step-response model and lets the agent learn the dynamics off-line from the model with minimal effort. After achieving satisfactory training performance on the model, the agent is fine-tuned on-line on the actual process to adapt to the real dynamics, thereby minimizing training time on the real process and avoiding unnecessary wear, which is beneficial for industrial applications. This sample-efficient method is tested and demonstrated on a pilot-scale multi-modal tank system, and its performance is verified through setpoint-tracking and disturbance-regulation experiments. (c) 2022 Elsevier Ltd. All rights reserved.
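The abstract outlines a two-phase scheme: identify an approximate step-response model, train the agent off-line against it, then fine-tune on-line. The sketch below is an illustrative reading of that loop, not the authors' implementation. It simulates a hypothetical first-order-plus-dead-time (FOPDT) process, treats one closed-loop step test as a one-shot bandit episode in the spirit of the "contextual bandits" keyword, and nudges a Gaussian distribution over (Kp, Ki, Kd) toward lower integral absolute error (IAE). The model constants, the IAE cost, the evolution-strategies-style update, and all function names are assumptions; the paper's constraints, identification step, and multi-modal tank are beyond this sketch.

import numpy as np

def simulate_iae(gains, K=2.0, tau=10.0, theta=2.0, dt=0.1, T=60.0):
    """IAE of a closed-loop unit setpoint step for an assumed FOPDT process
    tau * dy/dt = -y + K * u(t - theta) under an ideal PID controller."""
    kp, ki, kd = gains
    u_buf = [0.0] * round(theta / dt)  # dead-time buffer of past inputs
    y = integ = prev_err = iae = 0.0
    for _ in range(round(T / dt)):
        err = 1.0 - y                            # setpoint steps to 1 at t = 0
        integ += err * dt
        deriv = (err - prev_err) / dt
        prev_err = err
        u_buf.append(kp * err + ki * integ + kd * deriv)
        y += dt * (-y + K * u_buf.pop(0)) / tau  # explicit Euler step
        iae += abs(err) * dt
    return iae

def tune(n_iters=300, pop=16, sigma=0.05, lr=1e-3, seed=0):
    """One-step bandit view of tuning: action = (Kp, Ki, Kd), reward = -IAE.
    A Gaussian search distribution is shifted toward better-scoring samples
    (an evolution-strategies-style score-function update)."""
    rng = np.random.default_rng(seed)
    mean = np.array([0.5, 0.05, 0.1])  # assumed initial gains
    for _ in range(n_iters):
        noise = rng.normal(size=(pop, 3))
        cand = np.clip(mean + sigma * noise, 1e-4, None)  # keep gains positive
        reward = -np.array([simulate_iae(g) for g in cand])
        adv = (reward - reward.mean()) / (reward.std() + 1e-8)
        mean = np.clip(mean + lr * (adv @ noise), 1e-4, None)
    return mean

if __name__ == "__main__":
    gains = tune()  # "off-line" phase against the identified model
    print("off-line tuned gains:", gains, "IAE:", simulate_iae(gains))
    # On-line fine-tuning would continue the same update with rewards
    # measured on the real process, starting from these gains.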
Pages: 17
References
86 items in total
[1] Altman E. Constrained Markov Decision Processes, Vol. 7, 1999.
[2] Åström K.J. Advanced PID Control, Vol. 461, 2006.
[3] Åström K.J. Control System Design, 2002.
[4] Åström K.J., Hägglund T. Revisiting the Ziegler-Nichols step response method for PID control. Journal of Process Control, 2004, 14(6): 635-650.
[5] Åström K.J. Automatica, 1984, 20: 645. DOI: 10.1016/0005-1098(84)90014-1.
[6] Bao Y., Zhu Y., Qian F. A Deep Reinforcement Learning Approach to Improve the Learning Performance in Process Control. Industrial & Engineering Chemistry Research, 2021, 60(15): 5504-5515.
[7] Barto A.G., Sutton R.S., Anderson C.W. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, 1983, 13(5): 834-846.
[8] Bequette B.W. Process Control: Modeling, Design, and Simulation, 2003.
[9] Berner J., Soltesz K., Hägglund T., Åström K.J. An experimental comparison of PID autotuners. Control Engineering Practice, 2018, 73: 124-133.
[10] Bertsekas D.P., 2019, algorithm for optimal control with integral reinforcement learn