Towards better process management in wastewater treatment plants: Process analytics based on SHAP values for tree-based machine learning methods

Cited by: 163
Authors
Wang, Dong [1]
Thunell, Sven [2]
Lindberg, Ulrika [2]
Jiang, Lili [3]
Trygg, Johan [1]
Tysklind, Mats [1]
Affiliations
[1] Umea Univ, Dept Chem, SE-90187 Umea, Sweden
[2] Vakin, Ovagen 37, SE-90422 Umea, Sweden
[3] Umea Univ, Dept Comp Sci, SE-90187 Umea, Sweden
Keywords
Wastewater treatment; Process analytics; Interpretable AI; Machine learning; SHapley additive exPlanations; Activated sludge; Biological removal; Phosphorus removal
DOI
10.1016/j.jenvman.2021.113941
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification codes
08; 0830
Abstract
Understanding the mechanisms of pollutant removal in Wastewater Treatment Plants (WWTPs) is crucial for controlling effluent quality efficiently. However, the numerous treatment units and operational factors, and the interactions among them, usually obscure a comprehensive and precise understanding of the processes. We have previously proposed a machine learning (ML) framework to uncover complex cause-and-effect relationships in WWTPs. However, only one interpretable ML model, random forest (RF), was studied, and the interpretation method was not granular enough to reveal detailed relationships between operational factors and effluent parameters. Thus, in this paper, we present an upgraded framework involving three interpretable tree-based models (RF, XGBoost and LightGBM), three metrics (R², root mean squared error (RMSE), and mean absolute error (MAE)), and a more advanced interpretation system, SHapley Additive exPlanations (SHAP). Details of the framework are provided along with a demonstration of its practical applicability based on a case study of the Umea WWTP in Sweden. Results show that, for both labels, TSSe (total suspended solids in effluent) and PO4(e) (phosphate in effluent), the XGBoost models perform best whereas the RF models perform worst, owing to overfitting and polarized fitting. This study has yielded multiple new and significant findings with respect to the control of TSSe and PO4(e) in the Umea WWTP and other similarly configured WWTPs. Additionally, this study has produced two important generic findings relating to ML applications for WWTPs (or even other process industries) in terms of cause-and-effect investigations. First, the model comparison should be carried out from multiple perspectives to ensure that underlying details are fully revealed and examined. Second, using a precise, robust, and granular (feature attribution available for individual instances) explanation method can bring extra insight into both model comparison and model interpretation. SHAP is recommended as we found it to be of great value in this study.
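As a rough illustration of what "feature attribution available for individual instances" means, the following sketch computes exact Shapley values for a single prediction by enumerating feature coalitions. It is not the authors' pipeline and not the TreeSHAP algorithm: the toy model, the feature names (dose, flow, temp), the instance, and the baseline are all hypothetical, and features outside a coalition are simply replaced by baseline values, which is a simplifying assumption.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance by enumerating feature subsets.

    Features outside a coalition are set to their baseline value
    (a simplifying assumption made for this sketch).
    """
    n = len(instance)

    def value(coalition):
        # Coalition members take the instance's value, the rest the baseline.
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(x):
    """Hypothetical stand-in for a fitted tree ensemble (temp is unused)."""
    dose, flow, temp = x
    return 10.0 - 2.0 * dose + 0.5 * flow + 0.2 * dose * flow


instance = [3.0, 8.0, 12.0]   # hypothetical operating point
baseline = [1.0, 4.0, 15.0]   # hypothetical reference point
phi = shapley_values(toy_model, instance, baseline)

for name, contribution in zip(["dose", "flow", "temp"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and a feature the model never uses receives zero. Brute-force enumeration scales exponentially with the number of features, so it is only practical for very small examples like this one.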
Pages: 12