A survey and comparative evaluation of actor-critic methods in process control

Cited by: 23
Authors
Dutta, Debaprasad [1 ]
Upreti, Simant R. [1 ]
Affiliations
[1] Toronto Metropolitan Univ, Dept Chem Engn, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
actor-critic methods; process control; reinforcement learning; MODEL-PREDICTIVE CONTROL; LEARNING CONTROL; BATCH PROCESSES; NEURO-CONTROL; REINFORCEMENT; SYSTEM; PERFORMANCE; FRAMEWORK;
DOI
10.1002/cjce.24508
Chinese Library Classification (CLC) number
TQ [Chemical Industry];
Discipline classification code
0817;
Abstract
Actor-critic (AC) methods have emerged as an important class of reinforcement learning (RL) paradigms that enable model-free control by acting on a process and learning from the consequences. To that end, these methods employ artificial neural networks that work in tandem to evaluate actions and predict optimal ones. This feature is highly desirable for process control, especially when knowledge about a process is limited or when the process is susceptible to uncertainties. In this work, we summarize important concepts of AC methods and survey their process control applications. This treatment is followed by a comparative evaluation of the set-point tracking and robustness of controllers based on five prominent AC methods, namely deep deterministic policy gradient (DDPG), twin-delayed DDPG (TD3), soft actor-critic (SAC), proximal policy optimization (PPO), and trust region policy optimization (TRPO), in five case studies of varying process nonlinearity. The training demands and control performances indicate the superiority of the DDPG and TD3 methods, which rely on an off-policy, deterministic search for optimal action policies. Overall, the knowledge base and results of this work are expected to serve practitioners in their efforts toward further development of autonomous process control strategies.
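As a rough illustration of the actor-critic structure described in the abstract (not taken from the paper), the following PyTorch sketch pairs an actor network that predicts a control action with a critic network that evaluates it. The network sizes, the toy state and action dimensions, and the single DDPG-style update on a fabricated transition batch are assumptions for illustration only.

# A minimal sketch (not from the paper) of the actor-critic idea:
# one network predicts actions, another evaluates them.
# All dimensions, sizes, and the DDPG-style update are illustrative assumptions.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 3, 1  # assumed dimensions for a toy control loop

class Actor(nn.Module):
    """Maps a process state to a bounded control action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh(),  # actions scaled to [-1, 1]
        )

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    """Scores a state-action pair with an estimated return (Q-value)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

actor, critic = Actor(), Critic()
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

# One illustrative update on a fake batch of transitions (state, action, reward, next state).
s = torch.randn(32, STATE_DIM)
a = torch.rand(32, ACTION_DIM) * 2 - 1
r = torch.randn(32, 1)
s_next = torch.randn(32, STATE_DIM)
gamma = 0.99

# Critic step: regress Q(s, a) toward the bootstrapped target r + gamma * Q(s', pi(s')).
with torch.no_grad():
    target = r + gamma * critic(s_next, actor(s_next))
critic_loss = nn.functional.mse_loss(critic(s, a), target)
critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

# Actor step: adjust the policy to maximize the critic's evaluation of its own actions.
actor_loss = -critic(s, actor(s)).mean()
actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

In DDPG and TD3 the critic target would additionally use slowly updated target networks (and, for TD3, a pair of critics with delayed actor updates); those details are omitted here for brevity.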
Pages: 2028-2056
Number of pages: 29