Nonzero-Sum Game Reinforcement Learning for Performance Optimization in Large-Scale Industrial Processes

Cited by: 49
Authors
Li, Jinna [1, 2]
Ding, Jinliang [2]
Chai, Tianyou [2]
Lewis, Frank L. [2, 3]
Affiliations
[1] Liaoning Shihua Univ, Sch Informat & Control Engn, Fushun 113001, Liaoning, Peoples R China
[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China
[3] Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118 USA
Funding
National Natural Science Foundation of China;
Keywords
Optimization; Production; Games; Heuristic algorithms; Nash equilibrium; Reinforcement learning; Game theory; plant-wide performance optimization; reinforcement learning; DIFFERENTIAL GRAPHICAL GAMES; MULTIAGENT SYSTEMS; DESIGN; SYNCHRONIZATION;
DOI
10.1109/TCYB.2019.2950262
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This article presents a novel technique to achieve plant-wide performance optimization for large-scale unknown industrial processes by integrating the reinforcement learning method with the multiagent game theory. A main advantage of this technique is that plant-wide optimal performance is achieved by a distributed approach where multiple agents solve simplified local nonzero-sum optimization problems so that a global Nash equilibrium is reached. To this end, first, the plant-wide performance optimization problem is reformulated by decomposition into local optimization subproblems for each production index in a multiagent framework. Then, the nonzero-sum graphical game theory is utilized to compute the operational indices for each unit process with the purpose of reaching the global Nash equilibrium, resulting in production indices following their prescribed target values. The stability and the global Nash equilibrium of this multiagent graphical game solution are rigorously proved. The reinforcement learning methods are then developed for each agent to solve the nonzero-sum graphical game problem using data measurements available in the system in real time. The plant dynamics do not have to be known. Finally, the emulation results are given to show the effectiveness of the proposed automated decision algorithm by using measured data from a large mineral processing plant in Gansu Province, China.
Pages: 4132-4145
Page count: 14
Related papers
43 in total
  • [1] [Anonymous], 2010, CHEM ENG COMMUN, DOI 10.1080/00986440903358915
  • [2] Basar T., 1999, DYNAMIC NONCOOPERATI
  • [3] Data-Based Multiobjective Plant-Wide Performance Optimization of Industrial Processes Under Dynamic Environments
    Ding, Jinliang
    Modares, Hamidreza
    Chai, Tianyou
    Lewis, Frank L.
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2016, 12 (02) : 454 - 465
  • [4] Data-Driven Cooperative Output Regulation of Multi-Agent Systems via Robust Adaptive Dynamic Programming
    Gao, Weinan
    Jiang, Yu
    Davari, Masoud
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2019, 66 (03) : 447 - 451
  • [5] Cooperative Optimal Control for Multi-Agent Systems on Directed Graph Topologies
    Hengster-Movric, Kristian
    Lewis, Frank L.
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2014, 59 (03) : 769 - 774
  • [6] Huang X., 2005, P IFAC JUL, V38, P178
  • [7] Data-Driven Distributed Output Consensus Control for Partially Observable Multiagent Systems
    Jiang, He
    He, Haibo
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2019, 49 (03) : 848 - 858
  • [8] Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics
    Jiang, Yu
    Jiang, Zhong-Ping
    [J]. AUTOMATICA, 2012, 48 (10) : 2699 - 2704
  • [9] Multi-agent zero-sum differential graphical games for disturbance rejection in distributed control
    Jiao, Qiang
    Modares, Hamidreza
    Xu, Shengyuan
    Lewis, Frank L.
    Vamvoudakis, Kyriakos G.
    [J]. AUTOMATICA, 2016, 69 : 24 - 34
  • [10] Approximate N-Player Nonzero-Sum Game Solution for an Uncertain Continuous Nonlinear System
    Johnson, Marcus
    Kamalapurkar, Rushikesh
    Bhasin, Shubhendu
    Dixon, Warren E.
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (08) : 1645 - 1658