Sim-to-real transfer in reinforcement learning-based, non-steady-state control for chemical plants

Cited by: 1
Authors
Kubosawa, Shumpei [1 ,2 ]
Onishi, Takashi [1 ,2 ]
Tsuruoka, Yoshimasa [1 ,3 ]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, NEC AIST AI Cooperat Res Lab, Tokyo, Japan
[2] NEC Corp Ltd, Data Sci Res Labs, Kawasaki, Kanagawa, Japan
[3] Univ Tokyo, Dept Informat & Commun Engn, Tokyo, Japan
Keywords
Reinforcement learning; dynamic simulator; nonlinear system identification; disturbance rejection; chemical plant; MODEL PREDICTIVE CONTROL; GRADE TRANSITIONS;
DOI
10.1080/18824889.2022.2029033
CLC classification
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
We present a novel framework for controlling non-steady-state situations in chemical plants that addresses the behavioural gaps between the simulator used to construct the reinforcement learning-based controller and the real plant on which the framework is deployed. In reinforcement learning, the performance deterioration caused by such gaps is referred to as the simulation-to-reality (Sim-to-Real) gap. These gaps arise from multiple factors, including modelling errors in the simulator, incorrect state identification, and unpredicted disturbances in the real environment. We focus on these issues and divide the objective of performing optimal control under gapped conditions into three tasks: (1) identifying the model parameters and the current state, (2) optimizing the operation procedures, and (3) keeping the real situation close to the simulated and predicted situation by adjusting the control inputs. Each task is assigned to a reinforcement learning agent and trained individually; after training, the agents are integrated and collaborate on the original objective. We evaluate our method on an actual chemical distillation plant and demonstrate that the system successfully narrows the gaps caused by an emulated disturbance (a weather change with heavy rain) as well as by modelling errors, and achieves the desired states.
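The three-task decomposition in the abstract can be illustrated with a minimal sketch. All class names, method signatures, and the proportional stand-in policy below are hypothetical assumptions for illustration only; the paper's actual agents are trained reinforcement learning policies, not the placeholders shown here.

```python
# Illustrative sketch of the three-agent decomposition: (1) identify
# parameters/state, (2) plan operation procedures, (3) track the plan
# on the real plant. Names and logic are hypothetical placeholders.

class IdentifierAgent:
    """Task (1): estimate model parameters and the current plant state."""
    def identify(self, obs):
        # A trained identification policy would map the observation history
        # to estimates; this placeholder passes the observations through.
        return {"params": obs.get("params", {}), "state": obs.get("state", {})}

class PlannerAgent:
    """Task (2): optimise the operation procedure on the simulator."""
    def plan(self, params, state):
        # A trained planner would return a sequence of target setpoints;
        # this placeholder simply holds the identified state as the target.
        return [dict(state)]

class TrackerAgent:
    """Task (3): adjust control inputs so the real plant follows the plan."""
    def track(self, target, measurement):
        # Simple proportional correction as a stand-in for the learned policy.
        gain = 0.5
        return {k: gain * (target.get(k, 0.0) - measurement.get(k, 0.0))
                for k in target}

def control_step(obs, identifier, planner, tracker):
    """One pass through the integrated pipeline: identify, plan, track."""
    est = identifier.identify(obs)
    targets = planner.plan(est["params"], est["state"])
    return tracker.track(targets[0], obs.get("state", {}))
```

A usage example: with a measured column temperature of 90.0 and a target of 100.0, the tracker's proportional placeholder returns a corrective input of 5.0.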
Pages: 10-23
Number of pages: 14