A machine-learning framework for predicting multiple air pollutants' concentrations via multi-target regression and feature selection

Cited by: 70
Authors
Masmoudi, Sahar [1]
Elghazel, Haytham [2]
Taieb, Dalila [3]
Yazar, Orhan [2]
Kallel, Amjad [1]
Affiliations
[1] Univ Sfax, Sfax Natl Sch Engn, Lab Water Energy & Environm Lab 3E, Sfax, Tunisia
[2] Univ Lyon 1, UMR 5205, LIRIS, Villeurbanne, France
[3] Natl Agcy Environm Protect, ANPE, Tunis, Tunisia
Keywords
Air pollution; Machine learning; Multi-target regression (MTR); Feature ranking; Forecasting; SUPPORT VECTOR REGRESSION; POLLUTION PREDICTION; ENSEMBLES
DOI
10.1016/j.scitotenv.2020.136991
Chinese Library Classification (CLC) number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Air pollution is considered one of the biggest threats to the ecological system and human existence. Therefore, air quality monitoring has become a necessity in urban and industrial areas. Recently, the emergence of Machine Learning techniques has justified the application of statistical approaches for environmental modeling, especially in air quality forecasting. In this context, we propose a novel feature ranking method, termed Ensemble of Regressor Chains-guided Feature Ranking (ERCFR), to forecast multiple air pollutants simultaneously over two cities. This approach is based on a combination of one of the most powerful ensemble methods for Multi-Target Regression problems (Ensemble of Regressor Chains) and the Random Forest permutation importance measure. Thus, feature selection allowed the model to obtain the best results with a restricted subset of features. The experimental results reveal the superiority of the proposed approach compared to other state-of-the-art methods, although some cautions have to be considered to improve the runtime performance and to decrease its sensitivity to extreme and outlier values. © 2020 Elsevier B.V. All rights reserved.
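The abstract names two ingredients, an ensemble of regressor chains and the Random Forest permutation importance measure, but does not give the algorithm itself. The following is only a minimal sketch of how those two ingredients could be combined, not the authors' ERCFR implementation. It assumes scikit-learn, and the function name `erc_feature_ranking`, the synthetic data, the hold-out split, and all hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.multioutput import RegressorChain


def erc_feature_ranking(X_fit, Y_fit, X_val, Y_val, n_chains=3, seed=0):
    """Hypothetical helper: average permutation importances over several
    randomly ordered regressor chains with Random Forest base learners."""
    scores = np.zeros(X_fit.shape[1])
    for i in range(n_chains):
        # Each chain predicts the targets in a different random order.
        chain = RegressorChain(
            RandomForestRegressor(n_estimators=30, random_state=seed + i),
            order="random",
            random_state=seed + i,
        )
        chain.fit(X_fit, Y_fit)
        # Importance = drop in the chain's R^2 (averaged over targets)
        # when one input column is shuffled on held-out data.
        result = permutation_importance(
            chain, X_val, Y_val, n_repeats=3, random_state=seed + i
        )
        scores += result.importances_mean
    scores /= n_chains
    ranking = np.argsort(scores)[::-1]  # most important feature first
    return ranking, scores


# Synthetic stand-in data (assumption): 5 features, 2 correlated targets.
# Only features 0 and 1 carry signal; features 2-4 are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y0 = 3.0 * X[:, 0] + 0.1 * rng.normal(size=300)
y1 = 2.0 * X[:, 1] + 0.5 * y0 + 0.1 * rng.normal(size=300)
Y = np.column_stack([y0, y1])

X_fit, X_val, Y_fit, Y_val = train_test_split(X, Y, test_size=0.3, random_state=0)
ranking, scores = erc_feature_ranking(X_fit, Y_fit, X_val, Y_val)
print("feature ranking (best first):", ranking.tolist())
```

A restricted feature subset, as mentioned in the abstract, could then be taken from the top of `ranking`; how many features to keep is a choice this sketch leaves open.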
Pages: 16