Short-Term PM2.5 Concentration Changes Prediction: A Comparison of Meteorological and Historical Data

Cited by: 5
Authors
Kang, Junfeng [1]
Zou, Xinyi [1]
Tan, Jianlin [1]
Li, Jun [2]
Karimian, Hamed [1, 3]
Affiliations
[1] Jiangxi Univ Sci & Technol, Sch Civil & Surveying & Mapping Engn, Ganzhou 341000, Peoples R China
[2] Guangdong Sci & Technol Infrastruct Ctr, Guangzhou 510033, Peoples R China
[3] Jiangsu Ocean Univ, Sch Marine Technol & Geomat, Lianyungang 222005, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
PM2.5; prediction; machine learning; stacking; meteorological factor; NEURAL-NETWORK; ADABOOST ALGORITHM; AIR-POLLUTION; MODEL; CHINA; LIGHTGBM; CITIES
DOI
10.3390/su151411408
Chinese Library Classification
X [Environmental Science, Safety Science]
Subject classification codes
08; 0830
Abstract
Machine learning is widely used to predict PM2.5 concentrations. This study compares the prediction accuracy of machine learning models for short-term PM2.5 concentration changes and seeks a universal and robust model for both hourly and daily time scales. Five commonly used machine learning models were constructed, along with a stacking model that uses Multivariable Linear Regression (MLR) as the meta-learner and Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) as the base learners. Two kinds of datasets were preprocessed: meteorological data alone, and historical PM2.5 concentration data combined with meteorological data. Both were used to evaluate model accuracy and stability at hourly and daily time scales, using the coefficient of determination (R²), Root-Mean-Square Error (RMSE), and Mean Absolute Error (MAE). The results show that historical PM2.5 concentration data are crucial to the prediction precision of the machine learning models. On the meteorological datasets, the stacking model, XGBoost, and RF performed better for hourly prediction, and the stacking model, XGBoost, and LightGBM performed better for daily prediction. On the datasets combining historical PM2.5 concentrations with meteorological data, the stacking model, LightGBM, and XGBoost performed better at both hourly and daily scales. Overall, the stacking model outperformed the individual models; XGBoost was the best individual model for predicting PM2.5 concentration from meteorological data, and LightGBM was the best individual model when historical PM2.5 data were combined with meteorological data.
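The three evaluation metrics named in the abstract have standard definitions, and a minimal sketch may make the comparison concrete. The code below uses only the Python standard library; the function names and the sample concentration values are illustrative assumptions and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root-Mean-Square Error: square root of the mean squared residual."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean Absolute Error: mean of the absolute residuals."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical PM2.5 values (µg/m³), invented purely for illustration.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

scores = {
    "R2": r2(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
}
```

A higher R² and lower RMSE and MAE indicate a better fit. RMSE penalizes large errors more heavily than MAE, which is one reason studies of this kind often report both.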
Pages: 24