Spatial and spatiotemporal modelling of intra-urban ultrafine particles: A comparison of linear, nonlinear, regularized, and machine learning methods

Cited by: 1

Authors:

Vachon, Julien [1,2,3]

Buteau, Stephane [1,2,3]

Liu, Ying [1]

Van Ryswyk, Keith [4]

Hatzopoulou, Marianne [5]

Smargiassi, Audrey [1,2,3]

Affiliations:

[1] Univ Montreal, Sch Publ Hlth, Dept Environm & Occupat Hlth, 7101 Av Parc,Local 3259, Montreal, PQ, Canada

[2] Univ Montreal, Ctr Publ Hlth Res CReSP, Montreal, PQ, Canada

[3] CIUSSS Ctr Sud de l'Ile de Montreal, Montreal, PQ, Canada

[4] Hlth Canada, Water & Air Qual Bur, Air Pollut Exposure Sci Sect, Ottawa, ON, Canada

[5] Univ Toronto, Dept Civil Engn, Toronto, ON, Canada

Source:

SCIENCE OF THE TOTAL ENVIRONMENT | 2024 / Volume 954

Funding:

Canadian Institutes of Health Research; Natural Sciences and Engineering Research Council of Canada;

Keywords:

Machine learning; Statistical methods; Ultrafine particles; Spatiotemporal modelling; Land use regression; Mobile monitoring; USE REGRESSION-MODELS; MOBILE; ROBUSTNESS; VALIDATION; EXPOSURE;

DOI:

10.1016/j.scitotenv.2024.176523

Chinese Library Classification (CLC) number:

X [Environmental Science, Safety Science];

Discipline classification code:

08 ; 0830 ;

Abstract:

Background: Machine learning methods have been proposed to improve predictions of ambient air pollution, yet few studies have compared ultrafine particle (UFP) models across a broad range of statistical and machine learning approaches, and only one compared spatiotemporal models. Most reported marginal differences between methods. This limits our ability to draw conclusions about the best methods to model ambient UFPs.
Objective: To compare the performance and predictions of statistical and machine learning methods used to model spatial and spatiotemporal ambient UFPs.
Methods: Daily and annual models were developed from UFP measurements from a year-long mobile monitoring campaign in Quebec City, Canada, combined with 262 geospatial and six meteorological predictors. Various road segment lengths (100/300/500 m) were considered for UFP data aggregation. The four statistical methods included linear, non-linear, and regularized regressions, whereas the eight machine learning methods used tree-based, neural network, support vector, and kernel ridge algorithms. Nested cross-validation was used for model training, hyperparameter tuning, and performance evaluation.
Results: The mean annual UFP concentration was 13,335 particles/cm³. Machine learning outperformed statistical methods in predicting UFPs. Tree-based methods performed best across temporal scales and segment lengths, with XGBoost producing the best-performing models overall (annual R² = 0.78-0.86, RMSE = 2163-2169 particles/cm³; daily R² = 0.47-0.48, RMSE = 8651-11,422 particles/cm³). With 100 m segments, other annual models performed similarly well, but their prediction surfaces of annual mean UFP concentrations showed signs of overfitting. Spatial aggregation of monitoring data significantly impacted model performance. Longer segments yielded lower RMSE in all daily models and in annual statistical models, but not in annual machine learning models.
Conclusions: The use of tree-based methods significantly improved spatiotemporal predictions of UFP concentrations and, to a lesser extent, predictions of annual concentrations. Segment length and hyperparameter tuning had notable impacts on model performance and should be considered in future studies.
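
The abstract states that nested cross-validation was used for training, hyperparameter tuning, and performance evaluation, without giving further detail. The sketch below illustrates the general idea only: an inner loop selects a hyperparameter using the training folds, and an outer loop scores the tuned model on data it has never seen. Everything in it is an assumption made for illustration: the data are synthetic, the model is a toy single-predictor ridge regression rather than any of the study's twelve methods, and the function names, fold counts, and penalty grid are hypothetical.

```python
import math
import random

def make_folds(indices, k, rng):
    """Shuffle the indices and split them into k roughly equal folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def fit_ridge(x, y, lam):
    """Fit y ~ a + b*x with an L2 penalty lam on the slope b (toy model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / (sxx + lam)
    return my - b * mx, b

def rmse(model, x, y):
    """Root mean squared error of the fitted line on (x, y)."""
    a, b = model
    sq = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sq / len(x))

def subset(values, idx):
    return [values[i] for i in idx]

def inner_cv_rmse(x, y, folds, lam):
    """Mean validation RMSE of one penalty value over the inner folds."""
    scores = []
    for j, val_idx in enumerate(folds):
        train_idx = [i for m, f in enumerate(folds) if m != j for i in f]
        model = fit_ridge(subset(x, train_idx), subset(y, train_idx), lam)
        scores.append(rmse(model, subset(x, val_idx), subset(y, val_idx)))
    return sum(scores) / len(scores)

def nested_cv(x, y, lambdas, outer_k=5, inner_k=3, seed=0):
    """Return outer-fold test RMSEs and the penalty chosen in each fold."""
    rng = random.Random(seed)
    all_idx = list(range(len(x)))
    outer_scores, chosen = [], []
    for test_idx in make_folds(all_idx, outer_k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in all_idx if i not in held_out]
        # Inner loop: tune the penalty using the outer training data only.
        inner_folds = make_folds(train_idx, inner_k, rng)
        best = min(lambdas, key=lambda lam: inner_cv_rmse(x, y, inner_folds, lam))
        # Refit on all outer training data, then score on the untouched fold.
        model = fit_ridge(subset(x, train_idx), subset(y, train_idx), best)
        outer_scores.append(rmse(model, subset(x, test_idx), subset(y, test_idx)))
        chosen.append(best)
    return outer_scores, chosen

# Synthetic data (not from the study): y = 3 + 2x + Gaussian noise.
data_rng = random.Random(42)
x = [data_rng.uniform(0, 10) for _ in range(60)]
y = [3 + 2 * xi + data_rng.gauss(0, 1) for xi in x]

lambdas = [0.0, 1.0, 10.0, 100.0]
scores, chosen = nested_cv(x, y, lambdas)
print("outer-fold RMSE:", [round(s, 3) for s in scores])
print("chosen penalties:", chosen)
```

Because the penalty is chosen without ever looking at the outer test fold, the outer RMSE values estimate how the whole tuning-plus-fitting procedure performs on new data, which is the property that makes nested cross-validation suitable for comparing methods with different numbers of hyperparameters.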


Pages: 12

