Application and Evaluation of Machine Learning Techniques for Real-time Short-term Prediction of Air Pollutants

Cited by: 2

Authors:

Kim, Yeong-Il [1,3]

Lee, Kwon-Ho [2,3]

Park, Seung-Han [1,3]

Affiliations:

[1] Gangneung Wonju Natl Univ, Spatial Informat Cooperat Program, Kangnung, South Korea

[2] Gangneung Wonju Natl Univ, Dept Atmospher & Environm Sci, Kangnung, South Korea

[3] Gangneung Wonju Natl Univ, Res Inst Radiat Satellite, Kangnung, South Korea

Source:

JOURNAL OF KOREAN SOCIETY FOR ATMOSPHERIC ENVIRONMENT | 2023 / Vol. 39 / No. 01

Keywords:

Air quality; Machine learning; PM; Air pollutants; R PACKAGE;

DOI:

10.5572/KOSAE.2023.39.1.107

Chinese Library Classification number:

P4 [Atmospheric Sciences (Meteorology)];

Discipline classification code:

0706 ; 070601 ;

Abstract:

In this study, machine learning (ML) techniques were compared and evaluated for real-time short-term prediction of air pollutants, and the accuracy of predictions from the optimal technique was analyzed. Air quality data and meteorological data for four years (2015–2018) were used to train and test the ML system. The ML system consists of four models: Random Forest (RF), Support Vector Machine (SVM), Multiple Linear Regression (MLR), and Deep Neural Network (DNN). The optimal model was determined through an error analysis using four accuracy verification indices: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²). The estimated error ranges of the optimized model were NO2 = ±0.035 ppm, CO = ±0.071 ppm, SO2 = ±0.0008 ppm, O3 = ±0.006 ppm, PM10 = ±6.395 µg/m³, and PM2.5 = ±3.772 µg/m³. Using the optimized model selected by the highest grade acquisition method, the modelling results for the year 2019 showed relatively high accuracy: NO2 = 14.146 ± 5.864%, CO = 4.289 ± 1.025%, SO2 = 5.572 ± 1.306%, O3 = 5.549 ± 0.716%, PM10 = 4.031 ± 0.899%, and PM2.5 = 3.488 ± 0.990%. These prediction errors lie within the uncertainty of the model. Therefore, the suggested methodology was shown to be effective for short-term prediction of air pollutants.
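The four accuracy indices named in the abstract have standard textbook definitions, which can be sketched as follows. This is a minimal illustration and not the authors' code (the keywords suggest they worked with an R package); the function names and the sample values are hypothetical, and R² is computed here as 1 − SS_res/SS_tot, which may differ from the variant used in the paper.

```python
import math

# Hypothetical helper functions; names and sample data are illustrative only.

def mae(obs, pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root Mean Squared Error: penalizes large errors more than MAE."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mape(obs, pred):
    """Mean Absolute Percentage Error, in percent.
    Undefined when any observed value is zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

def r2(obs, pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Made-up observed and predicted concentrations (e.g. PM10 in µg/m³).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"MAPE = {mape(observed, predicted):.3f} %")
print(f"R2   = {r2(observed, predicted):.3f}")
```

Lower MAE, RMSE, and MAPE and a higher R² indicate a better fit, so comparing candidate models such as RF, SVM, MLR, and DNN on the same test set with these four values gives a consistent basis for ranking them.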

Pages: 107-127

Page count: 21
