TASAC: A twin-actor reinforcement learning framework with a stochastic policy with an application to batch process control

Cited by: 19
Authors
Joshi, Tanuja [1]
Kodamana, Hariprasad [1, 2]
Kandath, Harikumar [3]
Kaisare, Niket [4]
Affiliations
[1] Indian Inst Technol Delhi, Dept Chem Engn, New Delhi 110016, India
[2] Indian Inst Technol Delhi, Yardi Sch Artificial Intelligence, New Delhi 110016, India
[3] Int Inst Informat Technol Hyderabad, Hyderabad 500032, India
[4] Indian Inst Technol Madras, Dept Chem Engn, Chennai 600036, India
Keywords
Reinforcement learning; Actor-critic algorithm; Deep learning; Batch process; Model-predictive control; Transesterification; Efficient; Design; System
DOI
10.1016/j.conengprac.2023.105462
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Batch processes are challenging to control because of their complex nonlinear dynamics and batch-to-batch variability. The absence of accurate models, and the resulting plant-model mismatch, makes these problems harder still for advanced model-based control strategies. Reinforcement learning (RL), in which an agent learns a policy by interacting directly with the environment, offers a potential alternative in this context. RL frameworks with an actor-critic architecture have recently become popular for controlling systems with continuous state and action spaces. The current study proposes a stochastic actor-critic RL algorithm, termed Twin Actor Soft Actor-Critic (TASAC), which incorporates an ensemble of actors in a maximum entropy framework to improve learning through enhanced exploration. The efficacy of the proposed approach is demonstrated by applying it to the control of batch transesterification.
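For orientation, the maximum entropy framework mentioned in the abstract is commonly written as the objective below, following the standard soft actor-critic formulation. The notation (reward r, temperature α, entropy H, state-action distribution ρ_π, horizon T) is assumed here for illustration; the abstract gives no equations, and the paper's twin-actor formulation may differ in detail.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big)
  = -\,\mathbb{E}_{a \sim \pi(\cdot \mid s_t)}\left[ \log \pi(a \mid s_t) \right]
```

A larger α weights the entropy term more heavily and so favors more exploratory policies; as α approaches zero the objective reduces to the usual expected-return objective.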
Pages: 11