Enhancing offline reinforcement learning for wastewater treatment via transition filter and prioritized approximation loss

Cited by: 0
Authors
Yang, Ruyue [1 ,2 ,3 ,4 ]
Wang, Ding [1 ,2 ,3 ,4 ]
Li, Menghua [1 ,2 ,3 ,4 ]
Cui, Chengyu [4 ,5 ]
Qiao, Junfei [1 ,2 ,3 ,4 ]
Affiliations
[1] Beijing Univ Technol, Sch Informat Sci & Technol, Beijing, Peoples R China
[2] Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing, Peoples R China
[3] Beijing Univ Technol, Beijing Lab Smart Environm Protect, Beijing, Peoples R China
[4] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing, Peoples R China
[5] State Grid Corp China, State Grid Beijing Chaoyang Power Supply Branch, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Offline reinforcement learning; Wastewater treatment; Variational autoencoder; Adaptive dynamic programming
DOI
10.1016/j.neucom.2025.129977
Chinese Library Classification
TP18 (Artificial intelligence theory)
Discipline codes
081104; 0812; 0835; 1405
Abstract
Wastewater treatment plays a crucial role in urban society and requires efficient control strategies to optimize its performance. In this paper, we propose an enhanced offline reinforcement learning (RL) approach for wastewater treatment. Our algorithm improves the learning process in two ways: a transition filter screens out low-performance transitions, and a prioritized approximation loss achieves the effect of prioritized experience replay while sampling transitions uniformly. Additionally, a variational autoencoder is introduced to address the distribution-shift problem in offline RL. The proposed approach is evaluated on a nonlinear system and a wastewater treatment simulation platform, demonstrating its effectiveness in achieving optimal control. The contributions of this paper include the development of an improved offline RL algorithm for wastewater treatment and the integration of transition filtering with prioritized approximation loss. Evaluation results show that the proposed algorithm achieves lower tracking error and cost.
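The abstract names two data-side mechanisms: filtering out low-performance transitions, and weighting a uniformly sampled loss so that it approximates prioritized experience replay. The paper's exact formulation is not given here, so the following is only a minimal NumPy sketch under assumed definitions: the filter keeps transitions whose return estimate exceeds a quantile threshold, and priorities are taken as absolute TD errors used to reweight per-sample squared errors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical offline dataset: per-transition return estimates and
# absolute TD errors (both placeholders for quantities the paper defines).
returns = rng.normal(size=100)
td_errors = np.abs(rng.normal(size=100))

# Transition filter (assumed form): keep transitions whose return
# estimate is at or above the 30th percentile of the dataset.
threshold = np.quantile(returns, 0.3)
keep = returns >= threshold
filtered_td = td_errors[keep]

# Prioritized approximation loss (assumed form): sample uniformly, but
# weight each squared TD error by its normalized priority so the
# expected gradient matches priority-proportional sampling.
priorities = filtered_td + 1e-6          # small constant avoids zero priority
weights = priorities / priorities.sum() * len(priorities)  # mean weight is 1
loss = np.mean(weights * filtered_td ** 2)
```

The reweighting trick keeps the data pipeline simple (no priority-based sampler or sum-tree) while still emphasizing high-error transitions in the loss.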
Pages: 10