Unified control of diverse actions in a wastewater treatment activated sludge system using reinforcement learning for multi-objective optimization

Cited by: 5
Authors
Croll, Henry C. [1 ]
Ikuma, Kaoru [1 ]
Ong, Say Kee [1 ]
Sarkar, Soumik [2 ]
Affiliations
[1] Iowa State Univ, Dept Civil Construct & Environm Engn, Ames, IA 50011 USA
[2] Iowa State Univ, Dept Mech Engn, Ames, IA 50011 USA
Keywords
Machine learning; Operations; Control optimization; Energy efficiency; Benchmark Simulation Model No. 1 (BSM1); Twin Delayed Deep Deterministic Policy Gradient (TD3)
DOI
10.1016/j.watres.2024.122179
Chinese Library Classification (CLC)
X [Environmental science, safety science]
Discipline codes
08; 0830
Abstract
The operation of modern wastewater treatment facilities is a balancing act in which a multitude of variables are controlled to achieve a wide range of objectives, many of which conflict. This is especially true within secondary activated sludge systems, where significant research and industry effort has been devoted to advancing control optimization strategies, both domain-driven and data-driven. Among data-driven control strategies, reinforcement learning (RL) stands out for its ability to achieve better-than-human performance in complex environments. While RL has been applied to activated sludge process optimization in the existing literature, these applications are typically limited in scope and have never controlled more than three actions. Expanding the scope of RL control has the potential to increase optimization gains while concurrently reducing the number of control systems that operations staff must tune and maintain. This study examined several facets of the implementation of multi-action, multi-objective RL agents, namely how many actions a single agent could successfully control and how much environment data was necessary to train such agents. Control optimization improved as the action scope increased, although control of waste activated sludge remained a challenge. Furthermore, agents maintained a high level of performance under a reduced observation scope, up to a point. When compared to baseline control of the Benchmark Simulation Model No. 1 (BSM1), an RL agent controlling seven individual actions improved the average BSM1 performance metric by 8.3%, equivalent to an annual cost savings of $40,200 after accounting for the cost of additional sensors.
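A minimal sketch of the control setup described in the abstract, assuming a Gymnasium-style BSM1 simulator is available (the "BSM1Env-v0" environment ID and its seven-action layout are hypothetical placeholders, not a published package) and using the off-the-shelf TD3 implementation from stable-baselines3 rather than the authors' own code:

import gymnasium as gym
import numpy as np
from stable_baselines3 import TD3
from stable_baselines3.common.noise import NormalActionNoise

# Hypothetical BSM1 environment: seven continuous actions (e.g., zone
# aeration setpoints plus internal recycle, return, and waste sludge
# flows) against a plant-state observation vector. The environment is
# assumed to fold the multi-objective trade-off (effluent quality vs.
# operating cost) into a single scalar reward, e.g., a negated,
# weighted BSM1 performance index.
env = gym.make("BSM1Env-v0")

# Gaussian exploration noise on each action dimension, as is standard
# practice when training TD3.
n_actions = env.action_space.shape[0]  # 7 in the full multi-action case
action_noise = NormalActionNoise(
    mean=np.zeros(n_actions), sigma=0.1 * np.ones(n_actions)
)

model = TD3("MlpPolicy", env, action_noise=action_noise, verbose=1)
model.learn(total_timesteps=500_000)

TD3 is a natural fit here because all of the manipulated variables are continuous; narrowing the observation vector returned by the environment would be the lever for the reduced-observation-scope experiments the abstract describes.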
Pages: 10