Nonzero-Sum Game Reinforcement Learning for Performance Optimization in Large-Scale Industrial Processes

Cited by: 49

Authors:

Li, Jinna ^{[1,2]}

Ding, Jinliang ^{[2]}

Chai, Tianyou ^{[2]}

Lewis, Frank L. ^{[2,3]}

Affiliations:

[1] Liaoning Shihua Univ, Sch Informat & Control Engn, Fushun 113001, Liaoning, Peoples R China

[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China

[3] Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118 USA

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2020 / Vol. 50 / No. 09

Funding:

National Natural Science Foundation of China;

Keywords:

Optimization; Production; Games; Heuristic algorithms; Nash equilibrium; Reinforcement learning; Game theory; plant-wide performance optimization; reinforcement learning; DIFFERENTIAL GRAPHICAL GAMES; MULTIAGENT SYSTEMS; DESIGN; SYNCHRONIZATION;

DOI:

10.1109/TCYB.2019.2950262

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

This article presents a novel technique to achieve plant-wide performance optimization for large-scale unknown industrial processes by integrating the reinforcement learning method with the multiagent game theory. A main advantage of this technique is that plant-wide optimal performance is achieved by a distributed approach where multiple agents solve simplified local nonzero-sum optimization problems so that a global Nash equilibrium is reached. To this end, first, the plant-wide performance optimization problem is reformulated by decomposition into local optimization subproblems for each production index in a multiagent framework. Then, the nonzero-sum graphical game theory is utilized to compute the operational indices for each unit process with the purpose of reaching the global Nash equilibrium, resulting in production indices following their prescribed target values. The stability and the global Nash equilibrium of this multiagent graphical game solution are rigorously proved. The reinforcement learning methods are then developed for each agent to solve the nonzero-sum graphical game problem using data measurements available in the system in real time. The plant dynamics do not have to be known. Finally, the emulation results are given to show the effectiveness of the proposed automated decision algorithm by using measured data from a large mineral processing plant in Gansu Province, China.


Pages: 4132-4145

Page count: 14

References: 43 in total

[1] [Anonymous], 2010, CHEM ENG COMMUN, DOI 10.1080/00986440903358915
[2] Basar T., 1999, DYNAMIC NONCOOPERATI
[3] Data-Based Multiobjective Plant-Wide Performance Optimization of Industrial Processes Under Dynamic Environments
Ding, Jinliang
Modares, Hamidreza
Chai, Tianyou
Lewis, Frank L.
[J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2016, 12 (02) : 454 - 465
[4] Data-Driven Cooperative Output Regulation of Multi-Agent Systems via Robust Adaptive Dynamic Programming
Gao, Weinan
Jiang, Yu
Davari, Masoud
[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2019, 66 (03) : 447 - 451
[5] Cooperative Optimal Control for Multi-Agent Systems on Directed Graph Topologies
Hengster-Movric, Kristian
Lewis, Frank L.
[J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2014, 59 (03) : 769 - 774
[6] Huang X., 2005, P IFAC JUL, V38, P178
[7] Data-Driven Distributed Output Consensus Control for Partially Observable Multiagent Systems
Jiang, He
He, Haibo
[J]. IEEE TRANSACTIONS ON CYBERNETICS, 2019, 49 (03) : 848 - 858
[8] Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics
Jiang, Yu
Jiang, Zhong-Ping
[J]. AUTOMATICA, 2012, 48 (10) : 2699 - 2704
[9] Multi-agent zero-sum differential graphical games for disturbance rejection in distributed control
Jiao, Qiang
Modares, Hamidreza
Xu, Shengyuan
Lewis, Frank L.
Vamvoudakis, Kyriakos G.
[J]. AUTOMATICA, 2016, 69 : 24 - 34
[10] Approximate N-Player Nonzero-Sum Game Solution for an Uncertain Continuous Nonlinear System
Johnson, Marcus
Kamalapurkar, Rushikesh
Bhasin, Shubhendu
Dixon, Warren E.
[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (08) : 1645 - 1658
