Deep reinforcement learning with shallow controllers: An experimental application to PID tuning

Cited by: 70
Authors
Lawrence, Nathan P. [1 ]
Forbes, Michael G. [3 ]
Loewen, Philip D. [1 ]
McClement, Daniel G. [2 ]
Backstrom, Johan U. [4 ]
Gopaluni, R. Bhushan [2 ]
Affiliations
[1] Univ British Columbia, Dept Math, Vancouver, BC, Canada
[2] Univ British Columbia, Dept Chem & Biol Engn, Vancouver, BC, Canada
[3] Honeywell Proc Solut, N Vancouver, BC, Canada
[4] Backstrom Syst Engn Ltd, N Vancouver, BC, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Reinforcement learning; Deep learning; PID control; Process control; Process systems engineering; POLICY GRADIENT; NEURAL-NETWORKS; SYSTEM; STRATEGY; PHASE;
DOI
10.1016/j.conengprac.2021.105046
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Deep reinforcement learning (RL) is an optimization-driven framework for producing control strategies for general dynamical systems without explicit reliance on process models. Good results have been reported in simulation. Here we demonstrate the challenges in implementing a state-of-the-art deep RL algorithm on a real physical system. Aspects include the interplay between software and existing hardware; experiment design and sample efficiency; training subject to input constraints; and interpretability of the algorithm and control law. At the core of our approach is the use of a PID controller as the trainable RL policy. In addition to its simplicity, this approach has several appealing features: no additional hardware needs to be added to the control system, since a PID controller can easily be implemented through a standard programmable logic controller; the control law can easily be initialized in a "safe" region of the parameter space; and the final product, a well-tuned PID controller, has a form that practitioners can reason about and deploy with confidence.
Pages: 14
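
The core idea of the abstract, using the PID control law itself as the shallow, trainable RL policy, can be pictured with the minimal sketch below. This is an illustrative assumption of how such a policy might look (the class and method names, gain defaults, and constraint limits are hypothetical), not the authors' implementation; a deep RL algorithm such as an actor-critic method would adjust the gain vector from closed-loop data.

```python
import numpy as np

# Minimal sketch (hypothetical, not the authors' code): a PID control law whose
# three gains form the trainable parameter vector of an RL policy.
class PIDPolicy:
    def __init__(self, kp=1.0, ki=0.1, kd=0.0, dt=1.0, u_min=0.0, u_max=1.0):
        self.theta = np.array([kp, ki, kd], dtype=float)  # trainable parameters
        self.dt = dt                                      # sampling period
        self.u_min, self.u_max = u_min, u_max             # input (actuator) constraints
        self.integral = 0.0
        self.prev_error = 0.0

    def act(self, setpoint, measurement):
        """Map the current tracking error to a saturated control input."""
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        kp, ki, kd = self.theta
        u = kp * error + ki * self.integral + kd * derivative
        return float(np.clip(u, self.u_min, self.u_max))  # respect input constraints
```

Because the policy is just a PID controller, it can be initialized with conservative gains in a safe region of the parameter space and deployed on a standard programmable logic controller, as the abstract notes.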