Integration of reinforcement learning and model predictive control to optimize semi-batch bioreactor

Cited by: 32
Authors
Oh, Tae Hoon [1]
Park, Hyun Min [1]
Kim, Jong Woo [2]
Lee, Jong Min [1]
Affiliations
[1] Seoul Natl Univ, Sch Chem & Biol Engn, Inst Chem Proc, 1 Gwanak Ro, Seoul 08826, South Korea
[2] Tech Univ Berlin, Bioproc Engn, Berlin, Germany
Funding
National Research Foundation of Singapore
Keywords
bioprocess; deep neural network; model predictive control; optimal control; reinforcement learning; fed-batch fermentation; penicillin production; structured model; big data; stability
DOI
10.1002/aic.17658
Chinese Library Classification number
TQ [Chemical Industry]
Discipline classification code
0817
Abstract
As the digital transformation of bioprocesses progresses, several studies have proposed applying data-based methods to obtain a substrate feeding strategy that minimizes the operating cost of a semi-batch bioreactor. However, a careless application of model-free reinforcement learning (RL) is likely to fail to improve the existing control policy because the amount of available data is limited. In this article, we propose an algorithm that integrates a double deep Q-network with model predictive control. The proposed method learns the action-value function in an off-policy fashion and solves a model-based optimal control problem in which the terminal cost is assigned by the action-value function. In a simulation study, the proposed method, a model-based method, and model-free methods are applied to an industrial-scale penicillin process. The results show that the proposed method outperforms the other methods and can learn with less data than model-free RL algorithms.
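The abstract names a double deep Q-network as the learner of the action-value function. As a rough illustration only, the sketch below computes the standard double-DQN bootstrap target in a cost-minimizing convention. It is not taken from the article: the function name `double_dqn_cost_targets`, the array layout, and the discount factor `gamma` are assumptions made for this example, and the article's actual network, cost definition, and coupling to the model predictive controller are not reproduced here.

```python
import numpy as np

def double_dqn_cost_targets(costs, next_q_online, next_q_target, done, gamma=0.99):
    """Double-DQN targets for a cost-minimizing agent (hypothetical helper).

    costs:         shape (batch,), stage cost observed for each transition
    next_q_online: shape (batch, n_actions), online-network values at the next state
    next_q_target: shape (batch, n_actions), target-network values at the next state
    done:          shape (batch,), True where the transition ended the batch run
    gamma:         discount factor (assumed value, not from the article)
    """
    costs = np.asarray(costs, dtype=float)
    next_q_online = np.asarray(next_q_online, dtype=float)
    next_q_target = np.asarray(next_q_target, dtype=float)
    done = np.asarray(done, dtype=bool)

    # The online network selects the action (argmin, because we minimize cost) ...
    best_actions = np.argmin(next_q_online, axis=1)
    # ... and the target network evaluates that action.
    next_values = next_q_target[np.arange(len(costs)), best_actions]

    # No bootstrapping past a terminal transition.
    return costs + gamma * np.where(done, 0.0, next_values)


targets = double_dqn_cost_targets(
    costs=[1.0, 2.0],
    next_q_online=[[0.1, 0.9], [0.5, 0.2]],
    next_q_target=[[3.0, 4.0], [5.0, 6.0]],
    done=[False, True],
    gamma=0.5,
)
```

Separating action selection (online network) from action evaluation (target network) is what distinguishes double DQN from plain DQN. In a scheme like the one the abstract describes, a value function learned this way would then serve as the terminal cost of the finite-horizon optimal control problem.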
Pages: 16