Design of a Reinforcement Learning PID controller

Cited: 12
Authors
Guan, Zhe [1 ]
Yamamoto, Toru [2 ]
Affiliations
[1] Hiroshima Univ, Dream Driven Cocreat Res Ctr, KOBELCO Construct Machinery, Higashihiroshima, Japan
[2] Hiroshima Univ, Grad Sch Adv Sci & Engn, Higashihiroshima, Japan
Source
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2020
Keywords
Adaptive control; PID control; Reinforcement Learning; NETWORKS;
DOI
10.1109/ijcnn48605.2020.9207641
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses the design of a Proportional-Integral-Derivative (PID) controller with a new adaptive updating rule based on a Reinforcement Learning (RL) approach for nonlinear systems. A new design scheme in which RL complements conventional PID control technology is presented. In this study, a single Radial Basis Function (RBF) network is introduced to compute both the control policy function of the Actor and the value function of the Critic simultaneously. In keeping with the PID controller structure, the inputs of the RBF network are the system error, the difference of the output, and the second-order difference of the output, which are defined as the system states. The Temporal Difference (TD) error in this study is newly defined and involves an error criterion given by the difference between the one-step-ahead prediction and the reference value. The gradient descent method is applied to a performance index based on the TD error to derive the updating rules, so the network weights and the kernel function parameters can be calculated adaptively. Finally, numerical simulations on nonlinear systems illustrate the efficiency and robustness of the proposed scheme.
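The abstract describes a shared RBF network whose hidden layer feeds both an Actor (producing the PID gains) and a Critic (producing a state value), trained by gradient descent on a TD-error index. The following is a minimal sketch of that structure, not the paper's exact algorithm: the RBF centers, widths, learning rates, reward signal, and the softplus used to keep the gains positive are all illustrative assumptions.

```python
import numpy as np

class RLPIDController:
    """Actor-critic PID gain tuner with a shared RBF hidden layer (sketch).

    State s = [e(t), dy(t), d2y(t)]: system error, output difference,
    and second-order output difference, as in the paper's setup.
    """

    def __init__(self, n_rbf=5, gamma=0.95, lr_actor=0.01, lr_critic=0.05):
        rng = np.random.default_rng(0)
        # Hypothetical fixed centers/widths; the paper adapts the kernel too.
        self.centers = rng.uniform(-1.0, 1.0, size=(n_rbf, 3))
        self.widths = np.full(n_rbf, 1.0)
        self.w_actor = rng.normal(0.0, 0.1, size=(n_rbf, 3))  # -> Kp, Ki, Kd
        self.w_critic = np.zeros(n_rbf)                        # -> V(s)
        self.gamma, self.lr_a, self.lr_c = gamma, lr_actor, lr_critic

    def _phi(self, s):
        # Gaussian RBF activations shared by actor and critic.
        d2 = np.sum((self.centers - s) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.widths ** 2))

    def gains(self, s):
        # Softplus keeps the three PID gains positive (an assumption here).
        return np.log1p(np.exp(self._phi(s) @ self.w_actor))

    def value(self, s):
        return self._phi(s) @ self.w_critic

    def update(self, s, r, s_next):
        # TD error: delta = r + gamma * V(s') - V(s).
        delta = r + self.gamma * self.value(s_next) - self.value(s)
        phi = self._phi(s)
        # Gradient-descent-style updates driven by the TD error.
        self.w_critic += self.lr_c * delta * phi
        self.w_actor += self.lr_a * delta * np.outer(phi, np.ones(3))
        return delta
```

In a closed loop, each step would read the plant output, form the state vector, apply the velocity-form PID law with `gains(s)`, observe a reward (e.g. negative squared tracking error), and call `update(s, r, s_next)` so the gains adapt online.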
Pages: 6
References (27 total)
[1]   NEURONLIKE ADAPTIVE ELEMENTS THAT CAN SOLVE DIFFICULT LEARNING CONTROL-PROBLEMS [J].
BARTO, AG ;
SUTTON, RS ;
ANDERSON, CW .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1983, 13 (05) :834-846
[2]  
Bishop CM., 2006, Pattern Recognition and Machine Learning
[3]   A multivariable on-line adaptive PID controller using auto-tuning neurons [J].
Chang, WD ;
Hwang, RC ;
Hsieh, JG .
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2003, 16 (01) :57-63
[4]  
Chen JH, 2004, J PROCESS CONTR, V14, P211, DOI 10.1016/S0959-1524(03)00039-8
[5]  
Chien K.L., 1952, T AM SOC MECH ENG, V74, P175
[6]   RADIAL BASIS FUNCTION NEURAL-NETWORK FOR APPROXIMATION AND ESTIMATION OF NONLINEAR STOCHASTIC DYNAMIC-SYSTEMS [J].
ELANAYAR, S ;
SHIN, YC .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1994, 5 (04) :594-603
[7]  
Ferdowsi A., 2018, INT C INT TRANSP SYS
[8]  
Hagglund T., 1995, PID CONTROLLERS THEO
[9]   An Overview of Dynamic-Linearization-Based Data-Driven Control and Applications [J].
Hou, Zhongsheng ;
Chi, Ronghu ;
Gao, Huijun .
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2017, 64 (05) :4076-4090
[10]   On-line PID tuning for engine idle-speed control using continuous action reinforcement learning automata [J].
Howell, MN ;
Best, MC .
CONTROL ENGINEERING PRACTICE, 2000, 8 (02) :147-154