Forecasting PM2.5 concentration levels using shallow machine learning models on the Monterrey Metropolitan Area in Mexico

Cited by: 3
Authors
Pozo-Luyo, Cesar Alejandro [1]
Cruz-Duarte, Jorge M. [1]
Amaya, Ivan [1]
Ortiz-Bayliss, Jose Carlos [1]
Affiliations
[1] Tecnol Monterrey, Sch Engn & Sci, Ave Eugenio Garza Sada 2501, Monterrey 64700, Nuevo Leon, Mexico
Keywords
Air quality forecasting; PM2.5; Forecasting; Machine learning; Regression; Meteorological conditions; Air quality; Exposure
DOI
10.1016/j.apr.2023.101898
Chinese Library Classification (CLC) number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
The Monterrey Metropolitan Area is one of the most densely populated and polluted regions in Latin America. Hence, providing early warnings to the population when pollutant concentrations reach high levels is critical. Such warnings allow people at higher health risk to make informed decisions about when to go out, mitigating future health complications. Forecasting models make it possible to issue these warnings ahead of time. In this work, we implement a set of short-term shallow machine learning models to serve as a baseline for future forecasting analyses of PM2.5 concentration levels in the Monterrey Metropolitan Area. The proposed approach starts with Multiple Imputation by Chained Equations (MICE) for missing values, the incorporation of time metadata, and target winsorization. Then, we rely on random search for hyperparameter optimization of the machine learning models and on k-fold cross-validation, obtaining favorable results. We devise these models for a single-step, single-station analysis on an hourly multivariate air quality dataset (77203 rows and 16 columns, spanning January 1, 2015 00:00:00 to April 17, 2022 23:00:00) and compare them using standard regression metrics. The best-performing forecasting model was an Extra Trees Regressor, with a Root Mean Squared Error of 0.013, a Mean Absolute Error of 0.006 (equivalent to a Mean Absolute Percentage Error of 0.294% and a Symmetric Mean Absolute Percentage Error of 0.078%), and a Maximum Error of 0.187 µg/m³.
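As a rough illustration of three steps named in the abstract (time metadata, target winsorization, and the regression metrics used for comparison), the following Python sketch may help. It is not the authors' code. The function names, the 5%/95% winsorization limits, the choice of hour, weekday, and month as time features, and the particular SMAPE formula are all assumptions made for this example; the abstract does not specify any of them.

```python
import math
from datetime import datetime


def time_features(ts):
    """Derive simple calendar features from a timestamp.

    Assumption: hour, weekday, and month stand in for the "time metadata"
    mentioned in the abstract; the actual feature set is not stated there.
    """
    return {"hour": ts.hour, "weekday": ts.weekday(), "month": ts.month}


def winsorize(values, lower_q=0.05, upper_q=0.95):
    """Clip values to an empirical quantile range.

    Assumption: quantiles are taken with a simple floor-index rule and the
    default limits are illustrative, not taken from the paper.
    """
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[int(lower_q * (n - 1))]
    high = ordered[int(upper_q * (n - 1))]
    return [min(max(v, low), high) for v in values]


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE, MAPE (%), SMAPE (%), and maximum absolute error.

    Assumption: SMAPE is the mean of 2|e| / (|y| + |y_hat|); other
    definitions exist. True values must be nonzero for MAPE.
    """
    n = len(y_true)
    abs_err = [abs(p - t) for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    mae = sum(abs_err) / n
    mape = 100.0 / n * sum(e / abs(t) for e, t in zip(abs_err, y_true))
    smape = 100.0 / n * sum(
        2.0 * e / (abs(t) + abs(p))
        for e, t, p in zip(abs_err, y_true, y_pred)
    )
    return {
        "rmse": rmse,
        "mae": mae,
        "mape": mape,
        "smape": smape,
        "max_error": max(abs_err),
    }


if __name__ == "__main__":
    # Toy values only; these are not data from the study.
    observed = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 30.0, 44.0]
    print(time_features(datetime(2015, 1, 1, 0, 0, 0)))
    print(winsorize([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], 0.1, 0.9))
    print(regression_metrics(observed, predicted))
```

In a full pipeline these helpers would sit around an imputation step and a tuned regressor. They are kept dependency-free here so that each metric's definition is visible.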
Pages: 11