Optimal Directed Control of Discrete Event Systems Based on Reinforcement Learning

Cited by: 0
Authors
Hu, Yu-Hong [1]
Wang, De-Guang [1]
Yang, Ming [1]
Wang, Xi [2]
Affiliations
[1] The Electrical Engineering College, Guizhou University, Guiyang, Guizhou
[2] School of Electro-Mechanical Engineering, Xidian University, Xi'an, Shaanxi
Source
Tien Tzu Hsueh Pao/Acta Electronica Sinica | 2024, Vol. 52, No. 9
Funding
National Natural Science Foundation of China
Keywords
directed supervisor; discrete-event system; numerical optimization; optimal control; reinforcement learning
DOI
10.12263/DZXB.20221267
Abstract
When several controllable events (control commands) are enabled at the same state, a supervisor in the discrete-event system (DES) framework selects one of them at random. In practical applications such as traffic scheduling and robot path planning, however, directed control and numerical optimization must also be considered. This paper introduces an optimization mechanism that quantifies control cost and combines supervisory control theory (SCT) with reinforcement learning. It proposes a systematic procedure for synthesizing the optimal directed supervisor of a DES, so that the controlled system achieves three goals: (1) the control specifications on safety and liveness are not violated; (2) at most one controllable event can be executed at each state; (3) the cumulative cost of event execution from the initial state to a marked state is minimal. First, given automaton models of the plant and the specifications, the target automaton model is obtained as the synchronous product of the two models, and a cost function assigns an execution cost to each event in the target model. Second, the nonblocking and maximally permissive supervisor is synthesized by SCT. Finally, the supervisor is transformed into a Markov decision process, and the Q-learning algorithm is used to compute the optimal directed supervisor. Two applications verify the effectiveness and correctness of the proposed method. The simulation results show that the method achieves directed control of the system and minimizes the numerical cost of the directed supervisor. © 2024 Chinese Institute of Electronics. All rights reserved.
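The final step of the procedure (supervisor to Markov decision process to Q-learning) can be illustrated with a minimal sketch. This is not the authors' implementation: the five-state supervisor, its event names and costs, and the hyperparameters are all hypothetical, every event is assumed controllable, and the supervisor is assumed nonblocking so that every episode reaches the marked state. Each state of the supervisor is treated as an MDP state, each enabled event as an action, and the event cost as a quantity to minimize.

```python
import random

# Hypothetical toy supervisor (not from the paper): state -> {event: (next_state, cost)}.
# Assumption: all events are controllable and state 4 is the only marked state.
SUPERVISOR = {
    0: {"a": (1, 2.0), "b": (2, 1.0)},
    1: {"c": (4, 5.0)},
    2: {"d": (3, 1.0), "e": (4, 6.0)},
    3: {"f": (4, 1.0)},
    4: {},
}
INITIAL, MARKED = 0, {4}


def q_learning(trans, initial, marked, episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning that minimizes cumulative event cost (undiscounted)."""
    rng = random.Random(seed)
    q = {s: {e: 0.0 for e in events} for s, events in trans.items()}
    for _ in range(episodes):
        s = initial
        while s not in marked:
            events = list(trans[s])
            if rng.random() < epsilon:
                e = rng.choice(events)                     # explore
            else:
                e = min(events, key=lambda ev: q[s][ev])   # exploit: cheapest estimate
            nxt, cost = trans[s][e]
            # A marked state ends the episode, so its future cost is zero.
            future = 0.0 if nxt in marked else min(q[nxt].values())
            q[s][e] += alpha * (cost + future - q[s][e])
            s = nxt
    return q


def directed_supervisor(q, marked):
    """Keep exactly one event per non-marked state: the one with the lowest Q-value."""
    return {s: min(es, key=es.get) for s, es in q.items() if es and s not in marked}


def run(trans, policy, initial, marked):
    """Follow the directed supervisor and return (event sequence, total cost)."""
    s, word, total = initial, [], 0.0
    while s not in marked:
        e = policy[s]
        s, cost = trans[s][e]
        word.append(e)
        total += cost
    return word, total


if __name__ == "__main__":
    q_table = q_learning(SUPERVISOR, INITIAL, MARKED)
    policy = directed_supervisor(q_table, MARKED)
    print(policy, run(SUPERVISOR, policy, INITIAL, MARKED))
```

On this toy model the cheapest route to the marked state is b, d, f with total cost 3, and the learned directed supervisor should select exactly those events. Handling uncontrollable events, which a supervisor cannot disable, would need additional treatment that this sketch leaves out.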
Pages: 3172-3184
Page count: 12