A novel ensemble machine learning exposure model system for ground-level ozone at the national scale: A case of mainland China from 2013 to 2020

Cited by: 2
Authors
Wang, Jiawei [1]
Affiliations
[1] Henan Normal Univ, Coll Environm, 46 Jianshe East Rd, Xinxiang 453007, Peoples R China
Keywords
China; Ensemble method; Machine learning; Deep learning; Ozone pollution; Human exposure; SURFACE OZONE; SPATIOTEMPORAL PREDICTION; PARTICULATE MATTER; O-3; CONCENTRATIONS; RANDOM FOREST; POLLUTION; POLLUTANTS; PM2.5;
DOI
10.1016/j.eiar.2024.107630
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
In epidemiological research, accurate estimation of historical ground-level ozone (O₃) concentrations with enhanced spatiotemporal resolution is crucial for effective exposure assessment. The current state-of-the-art for estimating air pollutant concentrations is a two-stage ensemble method that integrates outputs from multiple machine learning algorithms. Despite its effectiveness, opportunities exist to refine this approach for more precise O₃ estimation. In this study, we propose an enhanced ensemble method that incorporates four key strategies. First, we employ high-resolution spatiotemporal predictors derived from prior machine learning studies for refined secondary learning. Second, we use sophisticated algorithms, including categorical gradient boosting, deep neural network, random forest, stochastic variable Gaussian process, transformer, and a combination of convolutional neural network and long short-term memory neural network, as sublearners to enhance learning capabilities. Third, we spatiotemporally split the sample set and then train submodels separately on each subset to eliminate the unobserved spatiotemporal heterogeneity. Finally, we apply a complex machine learning algorithm, rather than the generalized additive model, for integrating sublearner predictions, enabling the capture of intricate nonlinear relationships beyond basic spatiotemporal linear weights. To validate these improvements, we estimated daily maximum 8-h moving average O₃ concentrations ([O₃]MDA8) across the Chinese mainland from 2013 to 2020 at a 1 km spatial resolution. The proposed method demonstrated notable accuracy, achieving an out-of-station determination coefficient (R²) of 0.943 and a root-mean-square error (RMSE) of 10.197 μg/m³. This performance marks a nearly 15% improvement over the best existing Chinese O₃ exposure model based on a single algorithm and also surpasses previous studies utilizing traditional ensemble methods for other air pollutants. Our enhanced ensemble approach significantly bolsters the reliability and robustness of future environmental epidemiological studies by further mitigating "misclassification" errors.
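The abstract reports an out-of-station R² and RMSE but does not say how the validation was implemented. The following is a minimal sketch of one common reading, assuming a station-grouped k-fold split in which every monitoring station falls entirely into either the training or the test set. The function names, the fold assignment rule, and the toy values are all hypothetical and are not taken from the paper.

```python
import math


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def out_of_station_folds(station_ids, n_folds):
    """Split sample indices so that no station is in both train and test.

    Assumption: stations are assigned to folds round-robin after sorting;
    the paper may have used a different (e.g. random) assignment.
    """
    unique_stations = sorted(set(station_ids))
    fold_of_station = {s: i % n_folds for i, s in enumerate(unique_stations)}
    folds = []
    for k in range(n_folds):
        test_idx = [i for i, s in enumerate(station_ids)
                    if fold_of_station[s] == k]
        train_idx = [i for i, s in enumerate(station_ids)
                     if fold_of_station[s] != k]
        folds.append((train_idx, test_idx))
    return folds


# Toy example: six daily samples from three hypothetical stations.
station_ids = ["A", "A", "B", "B", "C", "C"]
folds = out_of_station_folds(station_ids, n_folds=3)
for train_idx, test_idx in folds:
    train_stations = {station_ids[i] for i in train_idx}
    test_stations = {station_ids[i] for i in test_idx}
    # Held-out stations never contribute training samples.
    assert train_stations.isdisjoint(test_stations)

# Hypothetical observed vs. predicted [O3]MDA8 values in ug/m3.
observed = [80.0, 100.0, 120.0, 140.0]
predicted = [82.0, 97.0, 123.0, 138.0]
print("R2:", r2_score(observed, predicted))
print("RMSE:", rmse(observed, predicted))
```

Grouping by station rather than by individual sample means the scores reflect performance at locations the model has never seen, which is the situation an exposure model faces when it predicts concentrations away from monitors.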
Pages: 13