An ensemble learning based hybrid model and framework for air pollution forecasting

Cited by: 53
Authors
Chang, Yue-Shan [1]
Abimannan, Satheesh [2]
Chiao, Hsin-Ta [3]
Lin, Chi-Yeh [1]
Huang, Yo-Ping [4]
Affiliations
[1] Natl Taipei Univ, Dept Comp Sci & Informat Engn, New Taipei, Taiwan
[2] Galgotias Univ, Greater Noida, Uttar Pradesh, India
[3] Tunghai Univ, Taichung, Taiwan
[4] Natl Taipei Univ Technol, Taipei, Taiwan
Keywords
Air pollution forecasting; Ensemble learning; LSTM; Pearson correlation coefficient; PM2.5; SVR; GBTR; ARTIFICIAL NEURAL-NETWORKS; PM2.5; CONCENTRATIONS; QUALITY PREDICTION; PM10
DOI
10.1007/s11356-020-09855-1
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08; 0830;
Abstract
With the advance of the economy and industry, the impact of air pollution has gradually gained attention. To predict air quality, many studies have exploited various machine learning techniques to build predictive models for pollutant concentration or air quality. However, improving prediction performance remains a common problem in existing studies. Traditional models based on machine learning and deep learning methods, such as GBTR (gradient boosted tree regression), SVR (support vector machine-based regression), and LSTM (long short-term memory), are the most promising approaches to address this problem. Previous research has shown that ensemble learning can improve predictive performance in other domains. In this paper, we propose a hybrid model and framework to improve the accuracy of air pollution forecasting. We not only exploit a stacking-based ensemble learning scheme, which uses the Pearson correlation coefficient to calculate the correlation between different machine learning models, to integrate various forecasting models, but also construct a framework based on Spark+Hadoop machine learning and the TensorFlow deep learning framework to physically integrate these models and demonstrate air pollution forecasting for the next 1 to 8 h. We also conduct experiments and compare the results with the GBTR, SVR, LSTM, and LSTM2 (version 2) models to demonstrate the proposed hybrid model's predictive performance. The experimental results show that the hybrid model is superior to the existing models used for predicting air pollution.
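The abstract mentions using the Pearson correlation coefficient to compare different forecasting models before combining them. The following is a minimal sketch of that one step only, not the authors' implementation: the forecast values are made up for illustration, and the helper names `pearson` and `pairwise_correlations` are hypothetical. How the paper feeds these correlations into its stacking scheme is not stated in the abstract and is not shown here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(predictions):
    """Map each pair of model names to the correlation of their forecasts."""
    return {
        (a, b): pearson(predictions[a], predictions[b])
        for a, b in combinations(sorted(predictions), 2)
    }


# Made-up hourly PM2.5 forecasts, for illustration only (assumption).
predictions = {
    "GBTR": [12.0, 15.0, 19.0, 24.0, 22.0],
    "SVR": [11.0, 16.0, 18.0, 25.0, 21.0],
    "LSTM": [13.0, 14.0, 20.0, 23.0, 23.0],
}

correlations = pairwise_correlations(predictions)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

A pair of models whose forecasts are highly correlated tends to make similar errors, so one plausible use of such a table is to judge how much each base model adds to an ensemble.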
Pages: 38155-38168
Page count: 14
Related papers
55 in total
[2]  
[Anonymous], 2019, How Air Pollution is Destroying our Lungs
[3]  
[Anonymous], 2010 INT JOINT C NEU, DOI 10.1109/IJCNN.2010.5596900
[4]  
[Anonymous], 1997, Machine Learning
[5]   Air Pollution Forecasts: An Overview [J].
Bai, Lu ;
Wang, Jianzhou ;
Ma, Xuejiao ;
Lu, Haiyan .
INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 2018, 15 (04)
[6]   Forecast of daily PM2.5 concentrations applying artificial neural networks and Holt-Winters models [J].
Baptista Ventura, Luciana Maria ;
Pinto, Fellipe de Oliveira ;
Soares, Laiza Molezon ;
Luna, Aderval S. ;
Gioda, Adriana .
AIR QUALITY ATMOSPHERE AND HEALTH, 2019, 12 (03) :317-325
[7]  
Behera RN., 2016, Int J Comput Appl, V146, P31
[8]   Bagging predictors [J].
Breiman, L .
MACHINE LEARNING, 1996, 24 (02) :123-140
[9]  
Chang Y., 2010, J. Mach. Learn. Res., V11, P1471
[10]   An LSTM-based aggregated model for air pollution forecasting [J].
Chang, Yue-Shan ;
Chiao, Hsin-Ta ;
Abimannan, Satheesh ;
Huang, Yo-Ping ;
Tsai, Yi-Ting ;
Lin, Kuan-Ming .
ATMOSPHERIC POLLUTION RESEARCH, 2020, 11 (08) :1451-1463