A Hierarchical Framework for Top-k Location-aware Error-tolerant Keyword Search

被引:21
作者
Yang, Junye [1 ]
Zhang, Yong [1 ]
Zhou, Xiaofang [2 ]
Wang, Jin [3 ]
Hu, Huiqi [4 ]
Xing, Chunxiao [1 ]
机构
[1] Tsinghua Univ, RIIT, TNList, Dept Comp Sci & Technol, Beijing, Peoples R China
[2] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld, Australia
[3] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
[4] East China Normal Univ, Inst Data Sci & Engn, Shanghai, Peoples R China
来源
2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019) | 2019年
基金
国家重点研发计划;
关键词
EFFICIENT;
D O I
10.1109/ICDE.2019.00092
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Location-aware services have become widely available on a variety of devices. The resulting fusion of spatio-textual data enables the kind of top-k query that takes into account both location proximity and text relevance. Considering both the misspellings in user input and the data quality issues of spatiotextual databases, it is necessary to support error-tolerant spatial keyword search for end-users. Existing studies mainly focused on set-based textual relevance, but they cannot find reasonable results when the input tokens are not exactly matched with those from records in the database. In this paper, we propose a novel framework to solve the problem of top-k location-aware similarity search with fuzzy token matching. We propose a hierarchical index HGR-Tree to capture signatures of both spatial and textual relevance. Based on such an index structure, we devise a best-first search algorithm to preferentially access nodes of HGR-Tree with more similar objects while those with dissimilar ones can be pruned. We further devise an incremental search strategy to reduce the overhead brought by supporting fuzzy token matching. Experimental results on real world POI datasets show that our framework outperforms state-of-the-art methods by one to two orders of magnitude.
引用
收藏
页码:986 / 997
页数:12
相关论文
共 26 条
[1]  
[Anonymous], 2019, EDBT
[2]  
[Anonymous], 2009, Proc. VLDB Endow.
[3]  
[Anonymous], 2010, AAAI
[4]  
Bouros P, 2012, PROC VLDB ENDOW, V6, P1
[5]  
Chaudhuri S., 2006, ICDE, P5, DOI [10.1109/ICDE.2006.9, DOI 10.1109/ICDE.2006.9]
[6]   Diversity-Aware Top-k Publish/Subscribe for Text Stream [J].
Chen, Lisi ;
Cong, Gao .
SIGMOD'15: PROCEEDINGS OF THE 2015 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2015, :347-362
[7]   Spatial Keyword Query Processing: An Experimental Evaluation [J].
Chen, Lisi ;
Cong, Gao ;
Jensen, Christian S. ;
Wu, Dingming .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 6 (03) :217-228
[8]  
Chen Y.-Y., 2006, P ACM SIGMOD INT C M, P277
[9]   Querying Geo-Textual Data: Spatial Keyword Queries and Beyond [J].
Cong, Gao ;
Jensen, Christian S. .
SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2016, :2207-2212
[10]   Keyword search on spatial databases [J].
De Felipe, Ian ;
Hristidis, Vagelis ;
Rishe, Naphtali .
2008 IEEE 24TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2008, :656-+