Spatial and spatiotemporal modelling of intra-urban ultrafine particles: A comparison of linear, nonlinear, regularized, and machine learning methods

Cited by: 1
Authors
Vachon, Julien [1, 2, 3]
Buteau, Stephane [1, 2, 3]
Liu, Ying [1]
Van Ryswyk, Keith [4]
Hatzopoulou, Marianne [5]
Smargiassi, Audrey [1, 2, 3]
Affiliations
[1] Univ Montreal, Sch Publ Hlth, Dept Environm & Occupat Hlth, 7101 Av Parc,Local 3259, Montreal, PQ, Canada
[2] Univ Montreal, Ctr Publ Hlth Res CReSP, Montreal, PQ, Canada
[3] CIUSSS Ctr Sud de l'Ile de Montreal, Montreal, PQ, Canada
[4] Hlth Canada, Water & Air Qual Bur, Air Pollut Exposure Sci Sect, Ottawa, ON, Canada
[5] Univ Toronto, Dept Civil Engn, Toronto, ON, Canada
Funding
Canadian Institutes of Health Research; Natural Sciences and Engineering Research Council of Canada
Keywords
Machine learning; Statistical methods; Ultrafine particles; Spatiotemporal modelling; Land use regression; Mobile monitoring; USE REGRESSION-MODELS; MOBILE; ROBUSTNESS; VALIDATION; EXPOSURE;
DOI
10.1016/j.scitotenv.2024.176523
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Background: Machine learning methods are proposed to improve the predictions of ambient air pollution, yet few studies have compared ultrafine particle (UFP) models across a broad range of statistical and machine learning approaches, and only one compared spatiotemporal models. Most reported only marginal differences between methods. This limits our ability to draw conclusions about the best methods to model ambient UFPs.
Objective: To compare the performance and predictions of statistical and machine learning methods used to model spatial and spatiotemporal ambient UFPs.
Methods: Daily and annual models were developed from UFP measurements from a year-long mobile monitoring campaign in Quebec City, Canada, combined with 262 geospatial and six meteorological predictors. Various road segment lengths were considered (100/300/500 m) for UFP data aggregation. Four statistical methods included linear, non-linear, and regularized regressions, whereas eight machine learning regressions used tree-based, neural network, support vector, and kernel ridge algorithms. Nested cross-validation was used for model training, hyperparameter tuning, and performance evaluation.
Results: The mean annual UFP concentration was 13,335 particles/cm³. Machine learning outperformed statistical methods in predicting UFPs. Tree-based methods performed best across temporal scales and segment lengths, with XGBoost producing the overall best performing models (annual R² = 0.78-0.86, RMSE = 2163-2169 particles/cm³; daily R² = 0.47-0.48, RMSE = 8651-11,422 particles/cm³). With 100 m segments, other annual models performed similarly well, but their prediction surfaces of annual mean UFP concentrations showed signs of overfitting. Spatial aggregation of monitoring data significantly impacted model performance. Longer segments yielded lower RMSE in all daily models and for annual statistical models, but not for annual machine learning models.
Conclusions: The use of tree-based methods significantly improved spatiotemporal predictions of UFP concentrations, and to a lesser extent annual concentrations. Segment length and hyperparameter tuning had notable impacts on model performance and should be considered in future studies.
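The nested cross-validation mentioned in the Methods can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the data are synthetic, a one-predictor ridge regression stands in for the twelve methods compared in the study, and the helper names (`k_folds`, `fit_ridge`, `rmse`, `nested_cv`) are hypothetical. The point it shows is the separation of roles: the inner loop picks the hyperparameter using only the outer-training data, and the outer loop scores the tuned model on data that played no part in tuning.

```python
import random
from statistics import mean


def k_folds(n, k):
    """Split positions 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def fit_ridge(xs, ys, lam):
    """One-predictor ridge regression: penalized slope, unpenalized intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)
    return slope, my - slope * mx


def rmse(model, xs, ys):
    """Root mean squared error of a (slope, intercept) model."""
    slope, intercept = model
    return mean((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) ** 0.5


def nested_cv(xs, ys, lambdas, outer_k=5, inner_k=3):
    """Return one (chosen lambda, held-out RMSE) pair per outer fold."""
    n = len(xs)
    results = []
    for test_idx in k_folds(n, outer_k):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]

        def inner_score(lam):
            # Inner loop: validate lam on folds of the outer-training data only.
            scores = []
            for val_pos in k_folds(len(train_idx), inner_k):
                val_set = set(val_pos)
                fit_i = [train_idx[p] for p in range(len(train_idx)) if p not in val_set]
                val_i = [train_idx[p] for p in val_pos]
                model = fit_ridge([xs[i] for i in fit_i], [ys[i] for i in fit_i], lam)
                scores.append(rmse(model, [xs[i] for i in val_i], [ys[i] for i in val_i]))
            return mean(scores)

        best_lam = min(lambdas, key=inner_score)
        # Refit on all outer-training data, then score on the untouched outer fold.
        model = fit_ridge([xs[i] for i in train_idx], [ys[i] for i in train_idx], best_lam)
        score = rmse(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx])
        results.append((best_lam, score))
    return results


# Synthetic data (assumption): a linear signal plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(60)]
ys = [2 * x + 5 + rng.gauss(0, 1) for x in xs]
lambdas = [0.0, 0.1, 1.0, 10.0]

results = nested_cv(xs, ys, lambdas)
for fold, (lam, score) in enumerate(results, start=1):
    print(f"outer fold {fold}: lambda={lam}, held-out RMSE={score:.2f}")
```

Averaging the held-out RMSE across outer folds gives a performance estimate that is not biased by the tuning step, which is the usual reason for nesting the two loops.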
Pages: 12
Related papers
4 in total
  • [1] Intra-urban variation of ultrafine particles as evaluated by process related land use and pollutant driven regression modelling
    Ghassoun, Yahya
    Ruths, Matthias
    Loewner, Marc-Oliver
    Weber, Stephan
    SCIENCE OF THE TOTAL ENVIRONMENT, 2015, 536 : 150 - 160
  • [2] A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach
    Weichenthal, Scott
    Van Ryswyk, Keith
    Goldstein, Alon
    Bagg, Scott
    Shekkarizfard, Maryam
    Hatzopoulou, Marianne
    ENVIRONMENTAL RESEARCH, 2016, 146 : 65 - 72
  • [3] Spatiotemporal modelling of airborne birch and grass pollen concentration across Switzerland: A comparison of statistical, machine learning and ensemble methods
    Shokouhi, Behzad Valipour
    de Hoogh, Kees
    Gehrig, Regula
    Eeftens, Marloes
    ENVIRONMENTAL RESEARCH, 2024, 263
  • [4] A comparison of linear regression, regularization, and machine learning algorithms to develop Europe-wide spatial models of fine particles and nitrogen dioxide
    Chen, Jie
    de Hoogh, Kees
    Gulliver, John
    Hoffmann, Barbara
    Hertel, Ole
    Ketzel, Matthias
    Bauwelinck, Mariska
    van Donkelaar, Aaron
    Hvidtfeldt, Ulla A.
    Katsouyanni, Klea
    Janssen, Nicole A. H.
    Martin, Randall V.
    Samoli, Evangelia
    Schwarze, Per E.
    Stafoggia, Massimo
    Bellander, Tom
    Strak, Maciek
    Wolf, Kathrin
    Vienneau, Danielle
    Vermeulen, Roel
    Brunekreef, Bert
    Hoek, Gerard
    ENVIRONMENT INTERNATIONAL, 2019, 130