An adversarial twin-agent inverse proximal policy optimization guided by model predictive control

Times cited: 0
Authors
Gupta, Nikita [1, 4]
Kandath, Harikumar [2]
Kodamana, Hariprasad [1, 3, 4]
Affiliations
[1] Indian Inst Technol Delhi, Dept Chem Engn, Hauz Khas, New Delhi 110016, India
[2] Int Inst Informat Technol Hyderabad, Hyderabad, India
[3] Indian Inst Technol Delhi, Yardi Sch Artificial Intelligence, Hauz Khas, New Delhi, India
[4] Indian Inst Technol Delhi Abu Dhabi, Abu Dhabi, U Arab Emirates
Keywords
Reinforcement learning; Proximal Policy Optimization; Inverse Reinforcement Learning (IRL); Adversarial IRL (AIRL); Discriminator; CHO-CELLS; TEMPERATURE; PRODUCTIVITY; CHALLENGES; METABOLISM; PROGRESS; SYSTEMS; IMPACT; MPC;
DOI
10.1016/j.compchemeng.2025.109124
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Reward design is a key challenge in reinforcement learning (RL) because it directly affects the effectiveness of the learned policies. Inverse Reinforcement Learning (IRL) addresses this problem by learning reward functions from expert trajectories. This study designs rewards with an Adversarial IRL (AIRL) framework, using expert trajectories generated by Model Predictive Control (MPC). Conversely, there are cases where a pre-defined reward function works well, which indicates a potential trade-off between the two approaches. To exploit this trade-off, we propose a twin-agent reinforcement learning framework in which the first agent uses a pre-defined reward function, while the second agent learns its reward in the AIRL setting guided by MPC, with Proximal Policy Optimization (PPO) as the backbone (PPO-MPC-AIRL). The proposed algorithm is tested on a case study: monoclonal antibody (mAb) production in a bioreactor. Simulation results indicate that it reduces the root mean square error (RMSE) of set-point tracking by 18.38% compared to nominal PPO.
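As a rough illustration of two quantities named in the abstract, the sketch below shows the standard AIRL discriminator and the reward it induces, together with an RMSE and percent-reduction calculation for set-point tracking. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the scalar `f_value` (output of a learned reward approximator), and `log_pi` (log-probability of the action under the current policy) are hypothetical, and the record does not say how the two agents' rewards are combined, so that step is not shown.

```python
import math


def airl_discriminator(f_value: float, log_pi: float) -> float:
    """Standard AIRL discriminator D = exp(f) / (exp(f) + pi).

    Algebraically this equals sigmoid(f - log_pi); the branch below
    evaluates the sigmoid without overflowing for large |f - log_pi|.
    f_value and log_pi are hypothetical scalar inputs.
    """
    z = f_value - log_pi
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def airl_reward(f_value: float, log_pi: float) -> float:
    """Reward log D - log(1 - D), which simplifies to f - log_pi."""
    return f_value - log_pi


def rmse(setpoints: list[float], outputs: list[float]) -> float:
    """Root mean square error between set-points and measured outputs."""
    if not setpoints or len(setpoints) != len(outputs):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - y) ** 2 for r, y in zip(setpoints, outputs)]
    return math.sqrt(sum(squared) / len(squared))


def percent_reduction(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline
```

With these helpers, a figure such as the reported 18.38% would correspond to `percent_reduction(rmse_ppo, rmse_proposed)`, where both RMSE values come from the same set-point trajectory; the underlying RMSE values are not given in this record.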
Pages: 9
Related papers
50 in total
  • [1] Process control of mAb production using multi-actor proximal policy optimization
    Gupta, Nikita
    Anand, Shikhar
    Joshi, Tanuja
    Kumar, Deepak
    Ramteke, Manojkumar
    Kodamana, Hariprasad
    DIGITAL CHEMICAL ENGINEERING, 2023, 8
  • [2] Proximal policy optimization with an integral compensator for quadrotor control
    Huan Hu
    Qing-ling Wang
    Frontiers of Information Technology & Electronic Engineering, 2020, 21 : 777 - 795
  • [3] Proximal policy optimization with an integral compensator for quadrotor control
    Hu, Huan
    Wang, Qing-ling
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2020, 21 (05) : 777 - 795
  • [4] Inverse Model Optimization by Differential Evolution to improve Neural Predictive Control
    Morales-Perez, Edgar Ademir
    Iba, Hitoshi
    2020 JOINT 11TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS AND 21ST INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (SCIS-ISIS), 2020, : 325 - 332
  • [5] Enhancement of Control Performance for Degraded Robot Manipulators Using Digital Twin and Proximal Policy Optimization
    Park, Su-Young
    Lee, Cheonghwa
    Kim, Hyungjung
    Ahn, Sung-Hoon
    IEEE ACCESS, 2024, 12 : 19569 - 19583
  • [6] Hybrid CNN-LSTM and Proximal Policy Optimization Model for Traffic Light Control in a Multi-Agent Environment
    Faqir, Nada
    Ennaji, Yassine
    Chakir, Loqman
    Boumhidi, Jaouad
    IEEE ACCESS, 2025, 13 : 29577 - 29588
  • [7] Melanoma classification using generative adversarial network and proximal policy optimization
    Ju, Xiangui
    Lin, Chi-Ho
    Lee, Suan
    Wei, Sizheng
    PHOTOCHEMISTRY AND PHOTOBIOLOGY, 2024,
  • [8] Intelligent Control of a Quadrotor with Proximal Policy Optimization Reinforcement Learning
    Lopes, Guilherme Cano
    Ferreira, Murillo
    Simoes, Alexandre da Silva
    Colombini, Esther Luna
    15TH LATIN AMERICAN ROBOTICS SYMPOSIUM 6TH BRAZILIAN ROBOTICS SYMPOSIUM 9TH WORKSHOP ON ROBOTICS IN EDUCATION (LARS/SBR/WRE 2018), 2018, : 503 - 508
  • [9] Optimal Control Algorithm for Subway Train Operation by Proximal Policy Optimization
    Chen, Bin
    Gao, Chunhai
    Zhang, Lei
    Chen, Junjie
    Chen, Jun
    Li, Yuyi
    APPLIED SCIENCES-BASEL, 2023, 13 (13):
  • [10] Multivariable PID Control Using Improved State Space Model Predictive Control Optimization
    Wu, Sheng
    INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH, 2015, 54 (20) : 5505 - 5513