Systematic Performance Evaluation of Reinforcement Learning Algorithms Applied to Wastewater Treatment Control Optimization

Cited by: 27
Authors
Croll, Henry C. C. [1 ]
Ikuma, Kaoru [1 ]
Ong, Say Kee [1 ]
Sarkar, Soumik [2 ]
Affiliations
[1] Iowa State Univ, Dept Civil Construct & Environm Engn, Ames, IA 50011 USA
[2] Iowa State Univ, Dept Mech Engn, Ames, IA 50011 USA
Keywords
machine learning; operations; control optimization; energy efficiency; BSM1; twin delayed deep deterministic policy gradient (TD3); SOLIDS RETENTION TIME; TREATMENT PLANTS; BENCHMARK; BIOSOLIDS; AERATION; DESIGN
DOI
10.1021/acs.est.3c00353
Chinese Library Classification
X [Environmental science, safety science]
Subject Classification Codes
08; 0830
Abstract
Treatment of wastewater using activated sludge relies on several complex, nonlinear processes. While activated sludge systems can provide high levels of treatment, including nutrient removal, operating these systems is often challenging and energy intensive. Significant research investment has been made in recent years into improving control optimization of such systems, through both domain knowledge and, more recently, machine learning. This study leverages a novel interface between a common process modeling software and a Python reinforcement learning environment to evaluate four common reinforcement learning algorithms for their ability to minimize treatment energy use while maintaining effluent compliance within the Benchmark Simulation Model No. 1 (BSM1) simulation. Three of the algorithms tested, deep Q-learning, proximal policy optimization, and synchronous advantage actor critic, generally performed poorly over the scenarios tested in this study. In contrast, the twin delayed deep deterministic policy gradient (TD3) algorithm consistently produced a high level of control optimization while maintaining the treatment requirements. Under the best selection of state observation features, TD3 control optimization reduced aeration and pumping energy requirements by 14.3% compared to the BSM1 benchmark control, outperforming the advanced domain-based control strategy of ammonia-based aeration control, although future work is necessary to improve robustness of RL implementation.

Synopsis: Reinforcement learning can provide better than human control optimization. This study evaluates reinforcement learning control optimization strategies to reduce wastewater treatment energy costs while maintaining effluent compliance.
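The abstract describes coupling a process-simulation model (BSM1) to a Python reinforcement learning environment and training TD3 against it. The sketch below illustrates that general pattern, assuming a Gymnasium-style interface and the Stable-Baselines3 TD3 implementation; the environment class BSM1AerationEnv, its placeholder dynamics, and the reward weights are illustrative assumptions, not the authors' actual BSM1 interface or reward function.

```python
# Hypothetical sketch: a Gym-style wrapper standing in for a BSM1 simulator,
# trained with TD3 from Stable-Baselines3. Dynamics and rewards are placeholders.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import TD3


class BSM1AerationEnv(gym.Env):
    """Toy stand-in for a BSM1-type aeration-control environment.

    Observation: [effluent NH4, effluent NO3, tank DO, influent flow], normalized to [0, 1].
    Action: dissolved-oxygen setpoint for the aerated tanks, scaled to [-1, 1].
    Reward: negative aeration-energy proxy minus a penalty for effluent NH4 violations.
    """

    def __init__(self, steps_per_episode: int = 96):  # e.g., one day of 15 min control steps
        self.observation_space = spaces.Box(0.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        self._steps_per_episode = steps_per_episode
        self._state = None
        self._t = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._t = 0
        self._state = self.np_random.uniform(0.2, 0.8, size=4).astype(np.float32)
        return self._state, {}

    def step(self, action):
        self._t += 1
        do_setpoint = 0.5 * (float(action[0]) + 1.0)                # map [-1, 1] to [0, 1]
        nh4 = max(0.0, float(self._state[0]) - 0.5 * do_setpoint)   # more aeration, less ammonia
        energy = do_setpoint                                        # stand-in for aeration/pumping energy
        reward = -energy - 10.0 * max(0.0, nh4 - 0.4)               # penalize effluent limit violations
        self._state = np.array(
            [nh4, *self.np_random.uniform(0.2, 0.8, size=3)], dtype=np.float32
        )
        truncated = self._t >= self._steps_per_episode
        return self._state, reward, False, truncated, {}


if __name__ == "__main__":
    model = TD3("MlpPolicy", BSM1AerationEnv(), verbose=0)
    model.learn(total_timesteps=2_000)
```

In the study's actual setting, the step function would instead advance the process-model simulation and derive the reward from simulated aeration and pumping energy together with effluent-compliance penalties, as summarized in the abstract.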
Pages: 18382-18390
Number of pages: 9