Temporally boosting neural network for improving dynamic prediction of PM2.5 concentration with changing and unbalanced distribution

Cited by: 0
Authors
Shi, Haoze [1]
Yang, Xin [1]
Tang, Hong [1]
Tu, Yuhong [1]
Affiliations
[1] Beijing Normal Univ, Fac Geog Sci, State Key Lab Remote Sensing Sci, Beijing 100875, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
PM2.5; prediction; Machine learning; Concept drift; Imbalanced regression; AIR-POLLUTION; CHINA; TRANSPORT; MODEL;
DOI
10.1016/j.jenvman.2025.125371
Chinese Library Classification
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
A growing body of medical research suggests that even low PM2.5 concentrations may trigger significant health issues. Accurate prediction of PM2.5 is therefore important for protecting public health. However, current data-driven predictive methods show seasonal declines in model performance and have difficulty predicting extremely high values. These issues may stem from neglecting two crucial features of PM2.5 data streams: concept drift and imbalanced distribution. In this study, we validate this hypothesis through an in-depth analysis of the characteristics of the PM2.5 data stream and of the prediction errors of three mainstream models trained on it: random forest, convolutional neural network, and transformer. Based on the identified types of concept drift and the patterns of imbalanced distribution, we introduce the Temporally boosting neural network (Temp-boost), a novel ensemble learning method designed to enhance predictive accuracy by integrating static and dynamic models. Static models are trained on balanced historical datasets and typically receive infrequent updates. Dynamic models, by contrast, are trained on newly arrived data and are updated more frequently. We evaluated Temp-boost and the three models above in predicting gridded PM2.5 concentrations across the North China Plain in 2019. Compared with the three models, Temp-boost shows improved prediction accuracy across seasons, with notable gains at high pollution levels. Specifically, for pollution levels above lightly polluted, Temp-boost reduces the average MAE by 13.22 μg m⁻³ and the average RMSE by 13.32 μg m⁻³, with reductions peaking at 26.45 μg m⁻³ (MAE) and 25.76 μg m⁻³ (RMSE) in more severe cases.
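The static/dynamic split described in the abstract can be illustrated with a minimal sketch. This is not the authors' Temp-boost implementation: linear least-squares models stand in for the neural networks, and the bin-wise oversampling, the bin edges, the fixed weight `w_dynamic`, the synthetic data, and all function names are assumptions made purely for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Fit a least-squares linear model with an intercept; return its coefficients."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predict with coefficients returned by fit_linear."""
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef

def balance_by_bins(X, y, edges, rng):
    """Oversample so every occupied target bin has the same number of samples.

    Assumed balancing scheme; the abstract does not say how the balanced
    historical dataset is built.
    """
    bins = np.digitize(y, edges)
    groups = [np.flatnonzero(bins == b) for b in np.unique(bins)]
    n = max(len(g) for g in groups)
    idx = np.concatenate([rng.choice(g, size=n, replace=True) for g in groups])
    return X[idx], y[idx]

def ensemble_predict(static_coef, dynamic_coef, X, w_dynamic=0.5):
    """Blend static and dynamic predictions with a fixed weight (an assumption)."""
    static_pred = predict_linear(static_coef, X)
    dynamic_pred = predict_linear(dynamic_coef, X)
    return (1.0 - w_dynamic) * static_pred + w_dynamic * dynamic_pred

rng = np.random.default_rng(0)

# Synthetic "historical" stream: skewed targets, few high values.
X_hist = rng.normal(size=(500, 3))
y_hist = np.abs(X_hist @ np.array([30.0, 20.0, 10.0])) + rng.normal(size=500)

# Illustrative bin edges in ug/m3 (hypothetical, not taken from the paper).
edges = [35.0, 75.0]

# Static model: trained once on the balanced history, updated rarely.
X_bal, y_bal = balance_by_bins(X_hist, y_hist, edges, rng)
static_coef = fit_linear(X_bal, y_bal)

# Dynamic model: refit on the newest window, whose relationship has drifted.
X_new = rng.normal(size=(60, 3))
y_new = np.abs(X_new @ np.array([40.0, 10.0, 5.0])) + rng.normal(size=60)
dynamic_coef = fit_linear(X_new, y_new)

pred = ensemble_predict(static_coef, dynamic_coef, X_new)
```

In this sketch only `dynamic_coef` would be refit as each new window arrives, while `static_coef` is kept fixed between infrequent retrainings; how the published method actually weights and schedules the two model types is not stated in the abstract.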
Pages: 18