TASAC: A twin-actor reinforcement learning framework with a stochastic policy with an application to batch process control

Cited by: 19

Authors:

Joshi, Tanuja ^{[1]}

Kodamana, Hariprasad ^{[1,2]}

Kandath, Harikumar ^{[3]}

Kaisare, Niket ^{[4]}

Affiliations:

[1] Indian Inst Technol Delhi, Dept Chem Engn, New Delhi 110016, India

[2] Indian Inst Technol Delhi, Yardi Sch Artificial Intelligence, New Delhi 110016, India

[3] Int Inst Informat Technol Hyderabad, Hyderabad 500032, India

[4] Indian Inst Technol Madras, Dept Chem Engn, Chennai 600036, India

Source:

CONTROL ENGINEERING PRACTICE | 2023 / Vol. 134

Keywords:

Reinforcement learning; Actor-critic algorithm; Deep learning; Batch process; MODEL-PREDICTIVE CONTROL; TRANSESTERIFICATION; EFFICIENT; DESIGN; SYSTEM

DOI:

10.1016/j.conengprac.2023.105462

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Batch processes are challenging to control because of their complex nonlinear dynamics and batch-to-batch variability. The absence of accurate models, and the resulting plant-model mismatch, makes these problems harder still for advanced model-based control strategies. Reinforcement learning (RL), in which an agent learns a policy by interacting directly with the environment, offers a potential alternative in this context. RL frameworks with an actor-critic architecture have recently become popular for controlling systems with continuous state and action spaces. This study proposes a stochastic actor-critic RL algorithm, termed Twin Actor Soft Actor-Critic (TASAC), which incorporates an ensemble of actors in a maximum entropy framework to improve learning through enhanced exploration. The efficacy of the proposed approach is demonstrated by applying it to the control of a batch transesterification process.
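
For orientation, the "maximum entropy framework" named in the abstract usually refers to the entropy-regularized objective from the soft actor-critic literature. The abstract does not state the objective, so the form below is the standard one and not necessarily the exact expression used in the paper. The symbols (policy \pi, reward r, temperature \alpha, state-action distribution \rho_\pi, horizon T) are assumed notation. How the twin actors enter the objective is not specified in this record.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \Big[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big)
  = -\,\mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \big[ \log \pi(a \mid s_t) \big]
```

A larger \alpha weights the entropy term more heavily and so favors more exploratory (higher-variance) policies; as \alpha approaches 0 the objective reduces to the conventional expected-return objective.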


Pages: 11
