Forecasting PM2.5 concentration levels using shallow machine learning models on the Monterrey Metropolitan Area in Mexico

Cited by: 3
Authors
Pozo-Luyo, Cesar Alejandro [1]
Cruz-Duarte, Jorge M. [1]
Amaya, Ivan [1]
Ortiz-Bayliss, Jose Carlos [1]
Affiliations
[1] Tecnol Monterrey, Sch Engn & Sci, Ave Eugenio Garza Sada 2501, Monterrey 64700, Nuevo Leon, Mexico
Keywords
Air quality forecasting; PM2.5; forecasting; Machine learning; Regression; METEOROLOGICAL CONDITIONS; AIR-QUALITY; EXPOSURE
DOI
10.1016/j.apr.2023.101898
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The Monterrey Metropolitan Area is one of the most densely populated and polluted regions in Latin America, so providing early warnings to the population when pollutant concentrations reach high levels is critical. Such warnings allow people at higher health risk to make informed decisions about when to go out, mitigating future health complications. Forecasting models make it possible to issue these warnings in time. In this work, we implement a set of short-term shallow machine learning models intended to serve as a baseline for future forecasting analyses of PM2.5 concentration levels in the Monterrey Metropolitan Area. The proposed approach starts with multiple imputation through chained equations for missing values, the incorporation of time metadata, and target winsorization. We then rely on random search for parameter optimization of the machine learning models, together with k-fold cross-validation. We devise these models for a single-step, single-station analysis on an hourly multivariate air quality dataset (77203 rows and 16 columns, spanning January 1, 2015 00:00:00 to April 17, 2022 23:00:00) and compare them using standard regression metrics. The best-performing forecasting model was an Extra Trees Regressor, with a Root Mean Squared Error of 0.013, a Mean Absolute Error of 0.006 (equivalent to a Mean Absolute Percentage Error of 0.294% and a Symmetric Mean Absolute Percentage Error of 0.078%), and a Maximum Error of 0.187 µg/m³.
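The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names (`winsorize`, `time_features`, `regression_metrics`), the quantile bounds, and the choice of time fields are hypothetical. The SMAPE variant used here (with the factor of 2 in the numerator) is also an assumption, since several definitions are in common use and the abstract does not say which one was applied.

```python
import math
from datetime import datetime


def winsorize(values, lower_q=0.05, upper_q=0.95):
    """Clip values to the empirical [lower_q, upper_q] quantile range.

    The quantile bounds are illustrative assumptions, not values from the paper.
    """
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[round(lower_q * (n - 1))]
    high = ordered[round(upper_q * (n - 1))]
    return [min(max(v, low), high) for v in values]


def time_features(timestamp):
    """Derive simple time metadata from an hourly timestamp (assumed fields)."""
    return {
        "hour": timestamp.hour,
        "weekday": timestamp.weekday(),  # Monday is 0
        "month": timestamp.month,
    }


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE, MAPE (%), SMAPE (%), and maximum error.

    MAPE assumes every true value is nonzero; SMAPE assumes that no
    (true, predicted) pair is zero in both positions.
    """
    n = len(y_true)
    abs_err = [abs(p - t) for t, p in zip(y_true, y_pred)]
    return {
        "rmse": math.sqrt(sum(e * e for e in abs_err) / n),
        "mae": sum(abs_err) / n,
        "mape": 100.0 / n * sum(e / abs(t) for e, t in zip(abs_err, y_true)),
        "smape": 100.0 / n * sum(
            2.0 * e / (abs(t) + abs(p))
            for e, t, p in zip(abs_err, y_true, y_pred)
        ),
        "max_error": max(abs_err),
    }


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the paper.
    observed = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 40.0]
    print(regression_metrics(observed, predicted))
    print(time_features(datetime(2015, 1, 1, 0, 0, 0)))
    print(winsorize([1, 2, 3, 4, 100], lower_q=0.0, upper_q=0.75))
```

Reporting percentage-based metrics alongside RMSE and MAE, as the abstract does, makes results easier to compare across stations with different concentration ranges. MAPE becomes unstable when observed values approach zero, so the two should be read together.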
Pages: 11