Anomaly Detection and Repairing for Improving Air Quality Monitoring

Cited by: 19
Authors
Rollo, Federica [1]
Bachechi, Chiara [1]
Po, Laura [1]
Affiliations
[1] Univ Modena & Reggio Emilia, Enzo Ferrari Engn Dept, I-41121 Modena, Italy
Keywords
low-cost sensors; air quality sensors; air quality monitoring; anomaly detection; anomaly repairing; multivariate time series; TIME-SERIES; LOW-COST; OUTLIER DETECTION; NEURAL-NETWORKS; CALIBRATION; ENVIRONMENT; STRATEGIES; PREDICTION; IMPUTATION; SENSORS
DOI
10.3390/s23020640
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Clean air in cities improves our health and overall quality of life and helps fight climate change and preserve our environment. High-resolution measurements of pollutant concentrations can support the identification of urban areas with poor air quality and raise citizens' awareness while encouraging more sustainable behaviors. Recent advances in Internet of Things (IoT) technology have led to extensive use of low-cost air quality sensors for hyper-local air quality monitoring. As a result, public administrations and citizens increasingly rely on information obtained from sensors to make decisions in their daily lives and mitigate pollution effects. Unfortunately, in most sensing applications, sensors are known to be error-prone. Thanks to Artificial Intelligence (AI) technologies, it is possible to devise computationally efficient methods that can automatically pinpoint anomalies in those data streams in real time. In order to enhance the reliability of air quality sensing applications, we believe that it is highly important to set up a data-cleaning process. In this work, we propose AIrSense, a novel AI-based framework for obtaining reliable pollutant concentrations from raw data collected by a network of low-cost sensors. It applies an anomaly detection and repairing procedure to raw measurements before applying the calibration model, which converts raw measurements into gas concentrations. There are very few studies of anomaly detection in raw air quality sensor data (millivolts). Our approach is the first that proposes to detect and repair anomalies in raw data before they are calibrated by considering the temporal sequence of the measurements and the correlations between different sensor features. If at least some previous measurements are available and not anomalous, the framework trains a model and uses its prediction to repair the observations; otherwise, it falls back on the previous observation.
First, a majority voting system based on three different algorithms detects anomalies in raw data. Then, anomalies are repaired to avoid missing values in the measurement time series. Finally, the calibration model provides the pollutant concentrations. Experiments conducted on a real dataset of 12,000 observations produced by 12 low-cost sensors demonstrated the importance of the data-cleaning process in improving the performance of calibration algorithms.
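The detect-then-repair idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the three detection algorithms, so the three detectors below (z-score, interquartile range, and median absolute deviation) are assumptions chosen only for illustration, as are all function names, thresholds, and the sample readings. The repair step shows only the fallback mentioned in the abstract (reusing the previous observation), not the model-based prediction.

```python
from statistics import mean, median, pstdev

# Three illustrative detectors; these are NOT the algorithms used by AIrSense,
# which the abstract does not specify.

def zscore_flags(values, threshold=3.0):
    """Flag points whose distance from the mean exceeds `threshold` std devs."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]

def iqr_flags(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])
    q3 = median(ordered[(n + 1) // 2:])
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v < low or v > high for v in values]

def mad_flags(values, threshold=3.5):
    """Flag points with a large modified z-score based on the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]

def majority_vote(values):
    """Mark a point anomalous when at least two of the three detectors agree."""
    votes = zip(zscore_flags(values), iqr_flags(values), mad_flags(values))
    return [sum(v) >= 2 for v in votes]

def repair_with_previous(values, flags):
    """Replace each flagged point with the last accepted (or repaired) value.

    A flagged first point has no predecessor and is left unchanged.
    """
    repaired = []
    for value, is_anomaly in zip(values, flags):
        if is_anomaly and repaired:
            repaired.append(repaired[-1])
        else:
            repaired.append(value)
    return repaired

# Hypothetical raw sensor readings in millivolts with one obvious spike.
raw_mv = [100, 101, 99, 102, 100, 500, 101, 98, 100, 99, 101, 100]
flags = majority_vote(raw_mv)
cleaned = repair_with_previous(raw_mv, flags)
```

Voting across several detectors reduces the chance that a single overly sensitive method discards valid readings, and repairing rather than dropping flagged points keeps the time series free of gaps before calibration.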
Pages: 21