Location Selection for Air Quality Monitoring With Consideration of Limited Budget and Estimation Error

Cited by: 7
Authors
Yu, Zhiyong [1]
Chang, Huijuan [2]
Yu, Zhiwen [2]
Guo, Bin [2]
Shi, Rongye [3]
Affiliations
[1] Fuzhou Univ, Fuzhou 350108, Peoples R China
[2] Northwestern Polytech Univ, Xian 710072, Peoples R China
[3] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Air quality; Monitoring; Correlation; Sensors; Uncertainty; Estimation error; Spatiotemporal phenomena; Air quality monitoring; estimation error; active learning; manifold preserving graph reduction; RECRUITMENT; FRAMEWORK;
DOI
10.1109/TMC.2021.3065656
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Accurate acquisition of air quality data is important for improving human well-being. However, directly monitoring air quality at all locations is costly. The challenge is how to select a small number of monitoring locations such that the estimation error of air quality at the other locations is minimized. In this paper, a general location selection strategy based on active learning is proposed, which iterates between a selector and an estimator. We implement four instances of this general strategy: KAL (Active Learning based on Kriging), TAL (Active Learning based on Regression Tree), KMAL (Active Learning based on Kriging and MPGR) and TMAL (Active Learning based on Regression Tree and MPGR). The estimator of KAL or TAL estimates the air quality at the remaining locations from air quality samples at the monitoring locations, leveraging the spatial or cross-domain correlation of air quality. The selection indicators of their selectors are designed to measure the uncertainty of unlabeled samples according to their estimators. KMAL and TMAL upgrade KAL and TAL, respectively, by introducing MPGR (Manifold Preserving Graph Reduction) to also take the representativeness of unlabeled samples into account. The experimental results show that the proposed strategy achieves a low estimation error with few monitoring locations. In particular, given the same budget (i.e., the number of monitoring locations), the estimation error is reduced from about 20 percent for the baselines to 15 percent by KAL and to 5 percent by KMAL; TMAL performs likewise.
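The selector/estimator iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple-kriging-style estimator with a fixed RBF covariance (unit prior variance), uses the estimator's predictive variance as the uncertainty indicator, and omits MPGR entirely. The function names `rbf_kernel`, `estimate`, and `select_locations`, the seed location, and the synthetic field are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between two sets of 2-D coordinates."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def estimate(coords, monitored, values, length_scale=1.0, noise=1e-6):
    """Estimator (assumed form): predictive mean and variance at every location,
    given readings `values` at the locations indexed by `monitored`."""
    X = coords[monitored]
    K = rbf_kernel(X, X, length_scale) + noise * np.eye(len(monitored))
    Ks = rbf_kernel(coords, X, length_scale)
    offset = values.mean()  # constant mean estimated from the monitored readings
    mean = offset + Ks @ np.linalg.solve(K, values - offset)
    # Predictive variance: prior variance (1.0) minus the explained part.
    var = 1.0 - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, np.clip(var, 0.0, None)


def select_locations(coords, field, budget, seed_index=0, length_scale=1.0):
    """Selector loop: repeatedly add the unmonitored location whose estimate
    is most uncertain, until `budget` locations are monitored."""
    monitored = [seed_index]  # assumed starting point, chosen arbitrarily
    while len(monitored) < budget:
        _, var = estimate(coords, monitored, field[monitored], length_scale)
        var[monitored] = -np.inf  # never re-select a monitored location
        monitored.append(int(np.argmax(var)))
    return monitored


# Synthetic 5x5 grid with a smooth made-up "air quality" field.
coords = np.array([[x, y] for x in range(5) for y in range(5)], dtype=float)
field = np.sin(coords[:, 0]) + np.cos(coords[:, 1])

monitored = select_locations(coords, field, budget=6)
mean, var = estimate(coords, monitored, field[monitored])
rest = [i for i in range(len(coords)) if i not in monitored]
print("monitored indices:", monitored)
print("mean abs. error at unmonitored locations:",
      float(np.abs(mean[rest] - field[rest]).mean()))
```

In this sketch the budget is simply the number of monitored locations, matching the abstract's definition. A representativeness term in the spirit of KMAL would have to be added to the selection score; how the paper combines the two criteria is not stated in this record.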
Pages: 4025-4037
Number of pages: 13