Spatiotemporal modelling of airborne birch and grass pollen concentration across Switzerland: A comparison of statistical, machine learning and ensemble methods

Cited by: 1
Authors
Shokouhi, Behzad Valipour [1,2]
de Hoogh, Kees [1,2]
Gehrig, Regula [3]
Eeftens, Marloes [1,2]
Affiliations
[1] Swiss Trop & Publ Hlth Inst, Allschwil, Switzerland
[2] Univ Basel, Basel, Switzerland
[3] Fed Off Meteorol & Climatol MeteoSwiss, Zurich, Switzerland
Funding
European Research Council; Swiss National Science Foundation
Keywords
Environmental stressors; Pollen; Machine learning; Land use regression; Spatiotemporal models; Exposure assessment; LAND-USE REGRESSION; POLLUTION; CORYLUS; ALNUS;
DOI
10.1016/j.envres.2024.119999
Chinese Library Classification number
X [Environmental Science, Safety Science]
Subject classification code
08; 0830
Abstract
Background: Statistical and machine learning models are commonly used to estimate spatial and temporal variability in exposure to environmental stressors in support of epidemiological studies. We aimed to compare the performance, strengths and limitations of six algorithms in the retrospective spatiotemporal modelling of daily birch and grass pollen concentrations at a spatial resolution of 1 km across Switzerland.
Methods: Daily birch and grass pollen concentrations were available from 14 measurement sites in Switzerland for 2000-2019. To develop the spatiotemporal models, we considered spatiotemporal, spatial and temporal predictors, including meteorological factors, land use, elevation, species distribution and the Normalized Difference Vegetation Index (NDVI). We used six statistical and machine learning algorithms: LASSO, Ridge, Elastic net, Random forest, XGBoost and artificial neural networks (ANNs). We optimized model structures through feature selection and grid search to obtain the best predictive performance. We used a train-test split and cross-validation to avoid overfitting and overoptimistic performance indicators. We then combined the six models through multiple linear regression to develop a hybrid ensemble model.
Results: The 5th-95th percentile ranges of birch and grass pollen concentrations were 0-151 and 0-105 grains/m³, respectively. The hybrid ensemble model achieved the best RMSE on the test dataset for both birch and grass pollen, at 94.4 and 19.7 grains/m³, respectively. Nonlinear models (Random forest, XGBoost and ANNs) achieved lower test RMSEs than linear models (LASSO, Ridge, Elastic net) for both pollen types, with RMSEs ranging from 105.9 to 140.5 grains/m³ for birch and from 20.0 to 25.4 grains/m³ for grass pollen. The Random forest algorithm yielded the best spatial and temporal performance among the six evaluated modelling methods. The hybrid ensemble model outperformed all six linear and nonlinear algorithms. Country-wide pollen concentration, land use, weather and NDVI were important predictors.
Conclusion: Nonlinear algorithms outperformed linear models and accurately explained complex, nonlinear relationships between environmental factors and measured concentrations.
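The ensemble step described in the abstract, combining the predictions of several base models through multiple linear regression, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, only three stand-in base models are used instead of six, and the names `base_preds`, `design` and `ensemble_predict` are hypothetical. It is not the authors' implementation, and the record does not say which predictions the regression was fitted on.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for daily pollen concentrations (grains/m^3); not real data.
n = 400
truth = rng.gamma(shape=1.5, scale=30.0, size=n)

# Hypothetical base-model predictions: the truth distorted by a
# model-specific scale, offset and noise level.
base_preds = np.column_stack([
    0.8 * truth + rng.normal(0, 25, n),   # stands in for a linear model
    1.1 * truth + rng.normal(0, 15, n),   # stands in for a tree-based model
    truth + 5 + rng.normal(0, 20, n),     # stands in for a neural network
])


def rmse(y, yhat):
    """Root mean squared error between observations and predictions."""
    return float(np.sqrt(np.mean((y - yhat) ** 2)))


def design(preds):
    """Prepend an intercept column to the matrix of base predictions."""
    return np.column_stack([np.ones(len(preds)), preds])


# Simple train-test split by index (assumed here; the study's split may differ).
train, test = np.arange(0, 300), np.arange(300, n)

# Multiple linear regression of the observations on the base predictions,
# solved by ordinary least squares.
coef, *_ = np.linalg.lstsq(design(base_preds[train]), truth[train], rcond=None)


def ensemble_predict(preds):
    """Weighted combination of base predictions using the fitted coefficients."""
    return design(preds) @ coef


train_rmse_base = [rmse(truth[train], base_preds[train, j]) for j in range(3)]
train_rmse_ens = rmse(truth[train], ensemble_predict(base_preds[train]))
test_rmse_base = [rmse(truth[test], base_preds[test, j]) for j in range(3)]
test_rmse_ens = rmse(truth[test], ensemble_predict(base_preds[test]))

print("base test RMSE:", [round(v, 1) for v in test_rmse_base])
print("ensemble test RMSE:", round(test_rmse_ens, 1))
```

On the training split, the least-squares combination cannot have a higher RMSE than any single base model, because each base model corresponds to one particular choice of coefficients. That guarantee does not extend to the test split, which is why the held-out RMSE is the figure worth comparing. In practice, the combining regression is often fitted on out-of-fold predictions so that the ensemble weights are not biased by base models that overfit their training data.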
Pages: 11