Deep reinforcement learning with shallow controllers: An experimental application to PID tuning

Cited by: 70
Authors
Lawrence, Nathan P. [1 ]
Forbes, Michael G. [3 ]
Loewen, Philip D. [1 ]
McClement, Daniel G. [2 ]
Backstrom, Johan U. [4 ]
Gopaluni, R. Bhushan [2 ]
Affiliations
[1] Univ British Columbia, Dept Math, Vancouver, BC, Canada
[2] Univ British Columbia, Dept Chem & Biol Engn, Vancouver, BC, Canada
[3] Honeywell Proc Solut, N Vancouver, BC, Canada
[4] Backstrom Syst Engn Ltd, N Vancouver, BC, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Reinforcement learning; Deep learning; PID control; Process control; Process systems engineering; POLICY GRADIENT; NEURAL-NETWORKS; SYSTEM; STRATEGY; PHASE;
DOI
10.1016/j.conengprac.2021.105046
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Deep reinforcement learning (RL) is an optimization-driven framework for producing control strategies for general dynamical systems without explicit reliance on process models. Good results have been reported in simulation. Here we demonstrate the challenges in implementing a state-of-the-art deep RL algorithm on a real physical system. Aspects include the interplay between software and existing hardware; experiment design and sample efficiency; training subject to input constraints; and interpretability of the algorithm and control law. At the core of our approach is the use of a PID controller as the trainable RL policy. In addition to its simplicity, this approach has several appealing features: no additional hardware needs to be added to the control system, since a PID controller can easily be implemented through a standard programmable logic controller; the control law can easily be initialized in a "safe" region of the parameter space; and the final product, a well-tuned PID controller, has a form that practitioners can reason about and deploy with confidence.
Pages: 14
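
The core idea of the abstract, using the PID control law itself as the shallow, trainable RL policy, can be pictured with the minimal sketch below. This is an illustrative assumption of how such a policy might look (the class and method names, gain defaults, and constraint limits are hypothetical), not the authors' implementation; a deep RL algorithm such as an actor-critic method would adjust the gain vector from closed-loop data.

```python
import numpy as np

# Minimal sketch (hypothetical, not the authors' code): a PID control law whose
# three gains form the trainable parameter vector of an RL policy.
class PIDPolicy:
    def __init__(self, kp=1.0, ki=0.1, kd=0.0, dt=1.0, u_min=0.0, u_max=1.0):
        self.theta = np.array([kp, ki, kd], dtype=float)  # trainable parameters
        self.dt = dt                                      # sampling period
        self.u_min, self.u_max = u_min, u_max             # input (actuator) constraints
        self.integral = 0.0
        self.prev_error = 0.0

    def act(self, setpoint, measurement):
        """Map the current tracking error to a saturated control input."""
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        kp, ki, kd = self.theta
        u = kp * error + ki * self.integral + kd * derivative
        return float(np.clip(u, self.u_min, self.u_max))  # respect input constraints
```

Because the policy is just a PID controller, it can be initialized with conservative gains in a safe region of the parameter space and deployed on a standard programmable logic controller, as the abstract notes.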