An Efficient Indexing Approach for Continuous Spatial Approximate Keyword Queries over Geo-Textual Streaming Data

被引:7
作者
Deng, Ze [1 ,2 ]
Wang, Meng [1 ,2 ]
Wang, Lizhe [1 ,2 ]
Huan, Xiaohui [1 ,2 ]
Han, Wei [1 ,2 ]
Chu, Junde [1 ,2 ]
Zomaya, Albert Y. [3 ]
机构
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Hubei, Peoples R China
[2] China Univ Geosci, Hubei Key Lab Intelligent Geoinformat Proc, Wuhan 430074, Hubei, Peoples R China
[3] Univ Sydney, Sch Informat Technol, Sydney, NSW 2006, Australia
基金
中国博士后科学基金; 中国国家自然科学基金;
关键词
continuous query; spatial approximate keyword matching; indexing methods; GPU;
D O I
10.3390/ijgi8020057
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Current social-network-based and location-based-service applications need to handle continuous spatial approximate keyword queries over geo-textual streaming data of high density. The continuous query is a well-known expensive operation. The optimization of continuous query processing is still an open issue. For geo-textual streaming data, the performance issue is more serious since both location information and textual description need to be matched for each incoming streaming data tuple. The state-of-the-art continuous spatial-keyword query indexing approaches generally lack both support for approximate keyword matching and high-performance processing for geo-textual streaming data. Aiming to tackle this problem, this paper first proposes an indexing approach for efficient supporting of continuous spatial approximate keyword queries by integrating min-wise signatures into an AP-tree, namely AP-tree(+). AP-tree(+) utilizes the one-permutation min-wise hashing method to achieve a much lower signature maintenance costs compared with the traditional min-wise hashing method because it only employs one hashing function instead of dozens. Towards providing a more efficient indexing approach, this paper has explored the feasibility of parallelizing AP-tree(+) by employing a Graphic Processing Unit (GPU). We mapped the AP-tree(+) data structure into the GPU's memory with a variety of one-dimensional arrays to form the GPU-aided AP-tree(+). Furthermore, a min-wise parallel hashing algorithm with a scheme of data parallel and a GPU-CPU data communication method based on a four-stage pipeline way have been used to optimize the performance of the GPU-aided AP-tree(+). The experimental results indicate that (1) AP-tree(+) can reduce the space cost by about 11% compared with MHR-tree, (2) AP-tree(+) can hold a comparable recall and 5.64 x query performance gain compared with MHR-tree while saving 41.66% maintenance cost on average, (3) the GPU-aided AP-tree(+) can attain an average speedup of 5.76 x compared to AP-tree(+), and (4) the GPU-CPU data communication scheme can further improve the query performance of the GPU-aided AP-tree(+) by 39.4%.
引用
收藏
页数:20
相关论文
共 31 条
[1]  
Babu S, 2001, SIGMOD REC, V30, P109, DOI 10.1145/603867.603884
[2]   GeoStreams: A Survey [J].
Brandt, Tobias ;
Grawunder, Marco .
ACM COMPUTING SURVEYS, 2018, 51 (03)
[3]  
Broder A. Z., 1998, Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, P327, DOI 10.1145/276698.276781
[4]  
Chen L., 2013, SIGMOD
[5]   Design Automation for Interwell Connectivity Estimation in Petroleum Cyber-Physical Systems [J].
Chen, Xiaodao ;
Zhang, Dongmei ;
Wang, Lizhe ;
Jia, Ning ;
Kang, Zhijiang ;
Zhang, Yun ;
Hu, Shiyan .
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2017, 36 (02) :255-264
[6]  
Cruz Mateus S. H., 2015, Database and Expert Systems Applications. 26th International Conference, DEXA 2015. Proceedings: LNCS 9261, P384, DOI 10.1007/978-3-319-22849-5_26
[7]   A Pivotal Prefix Based Filtering Algorithm for String Similarity Search [J].
Deng, Dong ;
Li, Guoliang ;
Feng, Jianhua .
SIGMOD'14: PROCEEDINGS OF THE 2014 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2014, :673-684
[8]   An Indexing Approach for Efficient Supporting of Continuous Spatial Approximate Keyword Queries [J].
Deng, Ze ;
Wang, Lizhe ;
Chu, Junde ;
Huang, Xiaohui ;
Han, Wei ;
Zomaya, Albert .
IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018, :132-139
[9]   Parallel Processing of Dynamic Continuous Queries over Streaming Data Flows [J].
Deng, Ze ;
Wu, Xiaoming ;
Wang, Lizhen ;
Chen, Xiaodao ;
Ranjan, Rajiv ;
Zomaya, Albert ;
Chen, Dan .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2015, 26 (03) :834-846
[10]  
Eom S, 2015, IEEE INT C SEMANT CO, P290, DOI 10.1109/ICOSC.2015.7050822