Integration of reinforcement learning and model predictive control to optimize semi-batch bioreactor

Cited by: 32

Authors:

Oh, Tae Hoon ^{[1]}

Park, Hyun Min ^{[1]}

Kim, Jong Woo ^{[2]}

Lee, Jong Min ^{[1]}

Affiliations:

[1] Seoul Natl Univ, Sch Chem & Biol Engn, Inst Chem Proc, 1 Gwanak Ro, Seoul 08826, South Korea

[2] Tech Univ Berlin, Bioproc Engn, Berlin, Germany

Source:

AICHE JOURNAL | 2022 / Vol. 68 / Issue 06

Funding:

National Research Foundation of Singapore;

Keywords:

bioprocess; deep neural network; model predictive control; optimal control; reinforcement learning; FED-BATCH FERMENTATION; PENICILLIN PRODUCTION; STRUCTURED MODEL; BIG DATA; STABILITY;

DOI:

10.1002/aic.17658

Chinese Library Classification number:

TQ [Chemical industry];

Discipline classification code:

0817 ;

Abstract:

As the digital transformation of bioprocesses progresses, several studies have proposed applying data-based methods to obtain a substrate feeding strategy that minimizes the operating cost of a semi-batch bioreactor. However, a careless application of model-free reinforcement learning (RL) is likely to fail to improve the existing control policy because the amount of available data is limited. In this article, we propose an algorithm that integrates a double deep Q-network with model predictive control. The proposed method learns the action-value function in an off-policy fashion and solves a model-based optimal control problem in which the terminal cost is assigned by the action-value function. In a simulation study, the proposed method, a model-based method, and model-free methods are applied to an industrial-scale penicillin process. The results show that the proposed method outperforms the other methods and can learn with less data than model-free RL algorithms.
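The core idea in the abstract, a finite-horizon model-based problem whose terminal cost comes from an action-value function, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-state model `model_step`, the cost `stage_cost`, the three feed levels in `ACTIONS`, and the fixed quadratic `q_value` standing in for a trained double deep Q-network. The search is plain enumeration rather than a nonlinear solver.

```python
from itertools import product

# Hypothetical discrete feed-rate levels (not from the paper).
ACTIONS = (0.0, 0.5, 1.0)


def model_step(x, u):
    """Hypothetical one-state process model: the state decays and is raised by feed u."""
    return 0.9 * x + u


def stage_cost(x, u):
    """Hypothetical stage cost: deviation from a setpoint of 2.0 plus a feed penalty."""
    return (x - 2.0) ** 2 + 0.1 * u


def q_value(x, u):
    """Stand-in for a learned action-value (cost-to-go) function.

    In the setting the abstract describes this would be a trained network;
    here it is a fixed quadratic so the example stays self-contained.
    """
    return 5.0 * (model_step(x, u) - 2.0) ** 2 + 0.1 * u


def mpc_action(x0, horizon=3):
    """Enumerate all action sequences over the horizon and return
    (first action of the cheapest sequence, its total cost).

    Total cost = sum of stage costs along the model rollout
                 + terminal cost min_a Q(x_H, a).
    """
    best_cost, best_first = float("inf"), None
    for seq in product(ACTIONS, repeat=horizon):
        x, total = x0, 0.0
        for u in seq:
            total += stage_cost(x, u)
            x = model_step(x, u)
        # Terminal cost assigned by the action-value function.
        total += min(q_value(x, a) for a in ACTIONS)
        if total < best_cost:
            best_cost, best_first = total, seq[0]
    return best_first, best_cost


if __name__ == "__main__":
    first_action, cost = mpc_action(0.0)
    print(first_action, round(cost, 6))
```

In a receding-horizon loop, only the returned first action would be applied before re-solving from the next measured state. The terminal term lets a short horizon account for cost beyond it, which is the role the abstract assigns to the action-value function.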

Pages: 16
