Optimal Digital Control with Uncertain Network Delay of Linear Systems Using Reinforcement Learning

Cited by: 4
Authors
Fujita, Taishi [1]
Ushio, Toshimitsu [1]
Affiliations
[1] Osaka Univ, Grad Sch Engn Sci, Toyonaka, Osaka 5608531, Japan
Source
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES | 2016 / Vol. E99A / No. 02
Keywords
reinforcement learning; adaptive control; output feedback control; optimal control; linear system; OPTIMAL TRACKING CONTROL; DISCRETE-TIME-SYSTEMS; FEEDBACK-CONTROL;
DOI
10.1587/transfun.E99.A.454
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812
Abstract
Recent developments in network technology make it possible to control a remote plant with a digital controller. However, transmitting control inputs and outputs over the network introduces a delay, and control performance degrades if the delay is not taken into account. In general, it is difficult to identify the delay beforehand. We also assume that the plant's parameters are uncertain. To solve this problem, we use reinforcement learning to achieve optimal digital control. First, we consider state feedback control. Next, we consider the case where the plant's outputs are observed, and apply reinforcement learning to output feedback control. Finally, we demonstrate by simulation that the proposed control method can find the optimal gain and adapt to changes in the delay.
Pages: 454-461
Page count: 8