Comparative study on total nitrogen prediction in wastewater treatment plant and effect of various feature selection methods on machine learning algorithms performance

Cited by: 153
Authors
Bagherzadeh, Faramarz [1]
Mehrani, Mohamad-Javad [2]
Basirifard, Milad [3]
Roostaei, Javad [4]
Affiliations
[1] Gdansk Univ Technol, Fac Mech Engn, Narutowicza St 11-12, PL-80233 Gdansk, Poland
[2] Gdansk Univ Technol, Fac Civil & Environm Engn, Narutowicza St 11-12, PL-80233 Gdansk, Poland
[3] Univ Tehran, Fac Engn, Dept Environm Engn, Enghelab Sq, Tehran, Iran
[4] Univ N Carolina, Dept Environm Sci & Engn, Chapel Hill, NC 27599 USA
Keywords
Machine learning; ANN; RF; GBM; Feature selection; Total nitrogen; ARTIFICIAL NEURAL-NETWORK;
DOI
10.1016/j.jwpe.2021.102033
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Predicting wastewater characteristics in wastewater treatment plants (WWTPs) is valuable because it can reduce sampling effort, energy use, and cost. Feature Selection (FS) methods are applied during pre-processing to improve model performance. This study evaluates the effect of seven FS methods (filter, wrapper, and embedded methods) on the prediction accuracy for total nitrogen (TN) in the WWTP influent flow. Four scenarios based on the FS suggestions were defined and compared using three supervised Machine Learning (ML) algorithms: Artificial Neural Network (ANN), Random Forest (RF), and Gradient Boosting Machine (GBM). The input parameters were daily time series of pH, DO, COD, BOD, MLSS, MLVSS, NH4-N, and TN concentration. The data set was divided into training and unseen test sets, and the performance of all models was evaluated using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and correlation coefficient (R²). The results show that scenario IV, which was suggested by Mutual Information and comprises NH4-N, COD, BOD, and DO, gave the best results among the FS methods. Furthermore, the decision-tree-based algorithms (RF and GBM) performed better than the neural network algorithm (ANN). GBM generalized the data set patterns very well and produced the best performance on the unseen data set, which shows the effectiveness of this state-of-the-art ML algorithm for predicting wastewater components.
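The two quantitative ingredients named in the abstract, ranking input parameters by Mutual Information and scoring predictions with RMSE, MAE, and R², can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The equal-width histogram estimator, the bin count, the function names, and the toy values are all assumptions made for illustration, and R² is computed here as the coefficient of determination, which may differ from the definition used in the paper.

```python
import math
from collections import Counter


def discretize(values, bins=4):
    """Map continuous values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:  # constant column: everything falls in one bin
        return [0] * len(values)
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X;Y) in nats (assumed estimator)."""
    xb, yb = discretize(x, bins), discretize(y, bins)
    n = len(xb)
    joint, px, py = Counter(zip(xb, yb)), Counter(xb), Counter(yb)
    # sum over occupied cells of p(a,b) * log(p(a,b) / (p(a) * p(b)))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


def rank_features(features, target, bins=4):
    """Return feature names sorted by decreasing MI with the target."""
    scores = {name: mutual_information(col, target, bins)
              for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up toy values, not data from the study.
tn = [40.0, 42.0, 45.0, 47.0, 50.0, 52.0, 55.0, 57.0]
features = {
    "NH4-N": [30.0, 31.5, 34.0, 35.5, 38.0, 39.5, 42.0, 43.5],  # tracks TN
    "pH": [7.0, 7.6, 7.0, 7.6, 7.0, 7.6, 7.0, 7.6],  # alternates regardless of TN
}
ranking = rank_features(features, tn)
print(ranking)
```

In this toy set the NH4-N column rises with TN while pH alternates independently of it, so NH4-N is ranked first. With real plant data the ranking depends on the estimator and bin count, so such a ranking is a screening aid and not a definitive selection.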
Pages: 7