Application and Evaluation of Machine Learning Techniques for Real-time Short-term Prediction of Air Pollutants

Cited by: 2
Authors
Kim, Yeong-Il [1 ,3 ]
Lee, Kwon-Ho [2 ,3 ]
Park, Seung-Han [1 ,3 ]
Affiliations
[1] Gangneung Wonju Natl Univ, Spatial Informat Cooperat Program, Kangnung, South Korea
[2] Gangneung Wonju Natl Univ, Dept Atmospher & Environm Sci, Kangnung, South Korea
[3] Gangneung Wonju Natl Univ, Res Inst Radiat Satellite, Kangnung, South Korea
Keywords
Air quality; Machine learning; PM; Air pollutants; R package
DOI
10.5572/KOSAE.2023.39.1.107
Chinese Library Classification
P4 [Atmospheric Sciences (Meteorology)]
Discipline classification code
0706; 070601
Abstract
This study compared and evaluated machine learning (ML) techniques for real-time, short-term prediction of air pollutants, and analyzed the accuracy of the predictions obtained with the optimal technique. Air quality and meteorological data covering four years (2015 to 2018) were used to train and test the ML system. The system consists of four models: Random Forest (RF), Support Vector Machine (SVM), Multiple Linear Regression (MLR), and Deep Neural Network (DNN). The optimal model was determined through an error analysis based on four accuracy verification indices: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²). The estimated error ranges of the optimized model were NO2 = ±0.035 ppm, CO = ±0.071 ppm, SO2 = ±0.0008 ppm, O3 = ±0.006 ppm, PM10 = ±6.395 µg/m³, and PM2.5 = ±3.772 µg/m³. Using the optimized model, selected by the highest grade acquisition method, the modelling results for 2019 showed relatively high accuracy (NO2 = 14.146 ± 5.864%, CO = 4.289 ± 1.025%, SO2 = 5.572 ± 1.306%, O3 = 5.549 ± 0.716%, PM10 = 4.031 ± 0.899%, PM2.5 = 3.488 ± 0.990%). These results indicate that the prediction error lies within the uncertainty of the model. The suggested methodology is therefore shown to be effective for short-term prediction of air pollutants.
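The four accuracy indices named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `regression_metrics`, the choice to skip zero-valued observations in MAPE, and the definition of R² as 1 - SS_res/SS_tot (rather than a squared correlation) are assumptions made here for illustration, and the sample values are invented.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, RMSE, MAPE (%) and R2 for paired observations and predictions.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and of equal length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    # Mean Absolute Error: average magnitude of the errors.
    mae = sum(abs(e) for e in errors) / n

    # Root Mean Squared Error: penalises large errors more heavily.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Mean Absolute Percentage Error, in percent.
    # Assumption: observations equal to zero are skipped, since the ratio is undefined.
    ratios = [abs(e / o) for o, e in zip(observed, errors) if o != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")

    # Coefficient of determination, assumed here to be 1 - SS_res / SS_tot.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Invented PM10-like values (µg/m³), for demonstration only.
example = regression_metrics([20.0, 25.0, 30.0, 35.0], [22.0, 24.0, 33.0, 33.0])
print(example)
```

Computing all four indices for each candidate model on the same held-out data would allow the models to be compared on a common basis. How the paper combines the indices into a single ranking is not specified in this record.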
Pages: 107-127
Number of pages: 21