Simulation Study of Reward Function to Improve the Performance of Chemical Process Control based on Reinforcement Learning

Cited by: 0
Authors
Park J. [1]
Shim J.H. [1]
Lee J.M. [1]
Affiliations
[1] School of Chemical and Biological Engineering, Seoul National University
Keywords
Chemical process control; Model predictive control; Reinforcement learning; Twin delayed deep deterministic policy gradient
DOI
10.5302/J.ICROS.2022.22.0152
Abstract
Process control is vital for operating chemical processes safely and efficiently while maintaining product quality. Process control has traditionally been implemented with techniques such as PID (Proportional-Integral-Derivative) control and MPC (Model Predictive Control). However, these approaches have disadvantages when the PID tuning rules do not provide optimal tuning parameters for various operating scenarios or when the model fails to match the actual plant. RL (Reinforcement Learning) is one of the most prominent data-driven techniques for addressing such issues and has gained popularity in recent years. Although RL has been extensively studied and has succeeded in controlling chemical processes, the reward function has a significant impact on its performance. Therefore, it is essential to specify the reward function appropriately in order to control the process efficiently. In this study, we propose a reward function to enhance the performance of RL and compare the control performance of MPC and of RL controllers with four distinct reward functions. The controllers track the setpoint of the product in the Van de Vusse process and are evaluated based on the deviation between the setpoint and the output. RL with the proposed reward function outperformed the other controllers: its performance was 12% better than that of MPC and 6-30% better than that of the other RL controllers. In chemical processes, the control performance of RL is enhanced by incorporating a time term and a positive reward in the reward function, thereby outperforming conventional control approaches. © ICROS 2022.
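The abstract attributes the improvement to a reward that combines a positive reward near the setpoint with a time term. Below is a minimal Python sketch of that idea, assuming a tracking-error penalty plus a tolerance-band bonus scaled by normalized episode time; the function name, tolerance band, and weights are illustrative assumptions, not the exact reward used in the paper.

```python
def shaped_reward(setpoint, output, step, horizon, tol=0.05, w_time=0.01):
    """Hypothetical shaped reward for setpoint tracking.

    - Penalizes the absolute deviation between setpoint and output.
    - Adds a positive reward when the output stays within a tolerance band.
    - Scales that bonus with the normalized time step, so holding the
      output on-spec later in the episode is rewarded more.
    """
    error = abs(setpoint - output)
    reward = -error  # base penalty on the tracking error
    if error < tol:
        reward += 1.0 + w_time * (step / horizon)  # positive reward with time term
    return reward

# Example: output near the setpoint at step 50 of a 200-step episode
print(shaped_reward(setpoint=1.0, output=0.97, step=50, horizon=200))
```

Such a shaping term keeps the gradient signal informative far from the setpoint (through the error penalty) while still giving the agent an explicit incentive to reach and hold the setpoint early and for as long as possible.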
Pages: 1185-1190
Number of pages: 5