Batch process control based on reinforcement learning with segmented prioritized experience replay

Cited by: 2

Authors:

Xu, Chen [1]

Ma, Junwei [1]

Tao, Hongfeng [1]

Affiliations:

[1] Jiangnan Univ, Key Lab Adv Proc Control Light Ind, Minist Educ, Wuxi 214122, Peoples R China

Source:

MEASUREMENT SCIENCE AND TECHNOLOGY | 2024 / Vol. 35 / Issue 05

Funding:

National Natural Science Foundation of China;

Keywords:

reinforcement learning; batch process; soft actor-critic; priority experience replay; maximum entropy framework;

DOI:

10.1088/1361-6501/ad21cf

Chinese Library Classification:

T [Industrial Technology];

Discipline classification code:

08 ;

Abstract:

Batch processes are difficult to control accurately because of their complex nonlinear dynamics and unstable operating conditions. Traditional methods such as model predictive control suffer seriously degraded control performance when the process model is inaccurate. In contrast, reinforcement learning (RL) provides a viable alternative by interacting directly with the environment to learn an optimal strategy. This paper proposes a batch process controller based on soft actor-critic (SAC) with segmented prioritized experience replay (SPER). SAC combines off-policy updates and maximum entropy RL with an actor-critic formulation, which can yield a more robust control strategy than other RL methods. To improve the efficiency of the experience replay mechanism in tasks with long episodes and multiple phases, a new experience sampling method called SPER is designed for SAC. In addition, a novel reward function is designed for the SPER-SAC based controller to deal with sparse rewards. Finally, the effectiveness of the SPER-SAC based controller is demonstrated on batch process examples by comparison with conventional RL-based control methods.
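The abstract does not spell out how SPER partitions or samples experience, so the following is only a minimal sketch of one plausible reading: transitions are stored in one buffer per episode segment (phase), and each minibatch draws evenly from the segments, with priority-proportional sampling inside each one. The class name `SegmentedPrioritizedReplay`, the equal-width time segmentation, the even per-segment quota, and the parameters `alpha` and `eps` are all assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import deque


class SegmentedPrioritizedReplay:
    """Hypothetical sketch: one prioritized buffer per episode segment."""

    def __init__(self, num_segments, episode_length,
                 capacity_per_segment=10_000, alpha=0.6, eps=1e-3, seed=0):
        self.num_segments = num_segments
        self.episode_length = episode_length
        self.alpha = alpha  # assumed priority exponent
        self.eps = eps      # keeps every priority strictly positive
        self.rng = random.Random(seed)
        # Each segment keeps its own bounded FIFO of (priority, transition).
        self.buffers = [deque(maxlen=capacity_per_segment)
                        for _ in range(num_segments)]

    def segment_of(self, t):
        # Assumption: segments are equal-width slices of the episode time axis.
        seg = t * self.num_segments // self.episode_length
        return min(seg, self.num_segments - 1)

    def add(self, t, transition, td_error=1.0):
        # Priority grows with the magnitude of the TD error.
        priority = (abs(td_error) + self.eps) ** self.alpha
        self.buffers[self.segment_of(t)].append((priority, transition))

    def sample(self, batch_size):
        non_empty = [buf for buf in self.buffers if buf]
        if not non_empty:
            raise ValueError("cannot sample from an empty replay buffer")
        # Split the batch as evenly as possible across non-empty segments.
        base, extra = divmod(batch_size, len(non_empty))
        batch = []
        for i, buf in enumerate(non_empty):
            k = base + (1 if i < extra else 0)
            if k == 0:
                continue
            weights = [p for p, _ in buf]
            items = [tr for _, tr in buf]
            # Priority-proportional sampling with replacement inside a segment.
            batch.extend(self.rng.choices(items, weights=weights, k=k))
        return batch
```

Under these assumptions, every phase of a long episode is represented in each update, while high-error transitions within a phase are still replayed more often. A full prioritized scheme would typically also apply importance-sampling weights and refresh priorities after each update; both are omitted here for brevity.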

Pages: 12

[41] Multi-Input Autonomous Driving Based on Deep Reinforcement Learning With Double Bias Experience Replay
Cui, Jianping
Yuan, Liang
He, Li
Xiao, Wendong
Ran, Teng
Zhang, Jianbo
IEEE SENSORS JOURNAL, 2023, 23 (11) : 11253 - 11261
[42] Prioritized Experience Replay-Based Deep Q Learning: Multiple-Reward Architecture for Highway Driving Decision Making
Yuan, Wei
Li, Yueyuan
Zhuang, Hanyang
Wang, Chunxiang
Yang, Ming
IEEE ROBOTICS & AUTOMATION MAGAZINE, 2021, 28 (04) : 21 - 31
[43] Reinforcement Learning with Experience Replay for Model-Free Humanoid Walking Optimization
Wawrzynski, Pawel
INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS, 2014, 11 (03)
[44] Double Broad Reinforcement Learning Based on Hindsight Experience Replay for Collision Avoidance of Unmanned Surface Vehicles
Yu, Jiabao
Chen, Jiawei
Chen, Ying
Zhou, Zhiguo
Duan, Junwei
JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2022, 10 (12)
[45] Re-attentive experience replay in off-policy reinforcement learning
Wei, Wei
Wang, Da
Li, Lin
Liang, Jiye
MACHINE LEARNING, 2024, 113 (05) : 2327 - 2349
[46] Progression Cognition Reinforcement Learning With Prioritized Experience for Multi-Vehicle Pursuit
Li, Xinhang
Yang, Yiying
Yuan, Zheng
Wang, Zhe
Wang, Qinwen
Xu, Chen
Li, Lei
He, Jianhua
Zhang, Lin
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (08) : 10035 - 10048
[47] Re-attentive experience replay in off-policy reinforcement learning
Wei, Wei
Wang, Da
Li, Lin
Liang, Jiye
MACHINE LEARNING, 2024, 113 : 2327 - 2349
[48] Tractable Reinforcement Learning for Signal Temporal Logic Tasks With Counterfactual Experience Replay
Wang, Siqi
Yin, Xunyuan
Li, Shaoyuan
Yin, Xiang
IEEE CONTROL SYSTEMS LETTERS, 2024, 8 : 616 - 621
[49] Multi-Actor-Critic Deep Reinforcement Learning with Hindsight Experience Replay
Sehgal, Adarsh
Sehgal, Muskan
La, Hung Manh
ADVANCES IN VISUAL COMPUTING, ISVC 2024, PT I, 2025, 15046 : 31 - 43
[50] Intelligent Ship Collision Avoidance Algorithm Based on DDQN with Prioritized Experience Replay under COLREGs
Zhai, Pengyu
Zhang, Yingjun
Wang, Shaobo
JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2022, 10 (05)
