A Machine Learning Approach to Predict Air Quality in California

Cited by: 140
Authors
Castelli, Mauro [1]
Clemente, Fabiana Martins [1]
Popovic, Ales [1, 2]
Silva, Sara [3]
Vanneschi, Leonardo [1, 3]
Affiliations
[1] Univ Nova Lisboa, NOVA Informat Management Sch NOVA IMS, Campus Campolide, P-1070312 Lisbon, Portugal
[2] Univ Ljubljana, Sch Econ & Business, Ljubljana 1000, Slovenia
[3] Univ Lisbon, Fac Ciencias, Dept Informat, LASIGE, P-1749016 Lisbon, Portugal
Keywords
SUPPORT VECTOR MACHINES; CLASSIFICATION; POLLUTION; IMPACT; MATTER; MODEL;
DOI
10.1155/2020/8049504
Chinese Library Classification (CLC) number
O1 [Mathematics]
Subject classification code
0701; 070101
Abstract
Predicting air quality is a complex task because pollutants and particulates are dynamic, volatile, and highly variable in time and space. At the same time, the ability to model, predict, and monitor air quality is increasingly relevant, especially in urban areas, given the observed critical impact of air pollution on citizens' health and the environment. In this paper, we employ a popular machine learning method, support vector regression (SVR), to forecast pollutant and particulate levels and to predict the air quality index (AQI). Among the tested alternatives, the radial basis function (RBF) was the kernel type that allowed SVR to obtain the most accurate predictions. Using the whole set of available variables proved to be a more successful strategy than selecting features using principal component analysis. The presented results demonstrate that SVR with an RBF kernel allows us to accurately predict hourly pollutant concentrations, such as carbon monoxide, sulfur dioxide, nitrogen dioxide, ground-level ozone, and particulate matter 2.5 (PM2.5), as well as the hourly AQI for the state of California. Classification into the six AQI categories defined by the US Environmental Protection Agency was performed with an accuracy of 94.1% on unseen validation data.
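The final step mentioned in the abstract, mapping AQI values onto the six US EPA categories and scoring category-level accuracy, can be illustrated with a minimal sketch. This is not the authors' code: the function names `aqi_category` and `category_accuracy` are hypothetical, the sample values are made up for illustration, and the SVR regression step that would produce the predicted AQI is not shown. The category boundaries are the standard EPA AQI ranges (0-50, 51-100, 101-150, 151-200, 201-300, above 300).

```python
from bisect import bisect_left

# Upper bounds of the first five EPA AQI categories.
# Any value above 300 falls into the sixth category, "Hazardous".
AQI_UPPER_BOUNDS = [50, 100, 150, 200, 300]
AQI_CATEGORIES = [
    "Good",
    "Moderate",
    "Unhealthy for Sensitive Groups",
    "Unhealthy",
    "Very Unhealthy",
    "Hazardous",
]


def aqi_category(aqi: float) -> str:
    """Return the EPA category name for a non-negative AQI value (hypothetical helper)."""
    if aqi < 0:
        raise ValueError("AQI must be non-negative")
    # bisect_left gives the index of the first upper bound that is >= aqi,
    # so 50 maps to "Good" and 51 maps to "Moderate".
    return AQI_CATEGORIES[bisect_left(AQI_UPPER_BOUNDS, aqi)]


def category_accuracy(true_aqi, predicted_aqi) -> float:
    """Fraction of samples whose predicted AQI lands in the true AQI's category."""
    pairs = list(zip(true_aqi, predicted_aqi))
    if not pairs:
        raise ValueError("at least one sample is required")
    hits = sum(aqi_category(t) == aqi_category(p) for t, p in pairs)
    return hits / len(pairs)


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    observed = [40, 120, 210]
    predicted = [45, 95, 250]
    print(category_accuracy(observed, predicted))
```

Scoring at the category level, as in this sketch, counts a prediction as correct whenever it falls in the same EPA band as the observed value, even if the numeric AQI differs.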
Pages: 23