A learned spatial textual index for efficient keyword queries

被引:1
作者
Ding, Xiaofeng [1 ]
Zheng, Yinting [1 ]
Wang, Zuan [1 ]
Choo, Kim-Kwang Raymond [2 ]
Jin, Hai [1 ]
机构
[1] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Sch Comp Sci & Technol, Serv Comp Technol & Syst Lab,Cluster & Grid Comp, Wuhan 430074, Hubei, Peoples R China
[2] Univ Texas San Antonio, San Antonio, TX 78249 USA
基金
中国国家自然科学基金;
关键词
Learned index; Spatial textual data; Query processing; Index optimization; SEARCH;
D O I
10.1007/s10844-022-00752-2
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Spatial textual indexing techniques allow one to efficiently access and process large volume of geospatial data, and recent research efforts have demonstrated that learned indexes can lead to better performance in comparison to conventional indexes. In this paper, we present a learned spatial textual index designed to process spatial textual data efficiently. Specifically, our proposed index is constructed based on the idea of radix table, spline points, and inverted lists. Besides, Morton encoding was used to convert high-dimensional coordinates into one dimension. In order to handle data insertion, deletion, and update in real-time, a gap array is used to store the underlying data, and a space reallocation strategy in units of spline points is designed. Based on the index, we propose query processing algorithms to handle different spatial keyword queries efficiently. An optimizer using random forest regression model was also designed to obtain appropriate index parameters for minimizing query latency. We evaluate our proposed index with IR-tree, and the findings show that our index outperforms IR-tree in terms of construction time, index size, and query efficiency.
引用
收藏
页码:803 / 827
页数:25
相关论文
共 40 条
[11]   Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads [J].
Ding, Jialin ;
Nathan, Vikram ;
Alizadeh, Mohammad ;
Kraska, Tim .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2020, 14 (02) :74-86
[12]  
Faloutsos C., 1994, Journal of Intelligent Information Systems: Integrating Artificial Intelligence and Database Technologies, V3, P231, DOI 10.1007/BF00962238
[13]  
Ferragina P., 2019, Recent Trends in Learning From Data (Recent Trends in Learning From Data)., V896, P5, DOI DOI 10.1007/978-3-030-43883-82
[14]   The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds [J].
Ferragina, Paolo ;
Vinciguerra, Giorgio .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2020, 13 (08) :1162-1175
[15]  
Finkel R. A., 1974, Acta Informatica, V4, P1, DOI 10.1007/BF00288933
[16]   FITing-Tree: A Data-aware Index Structure [J].
Galakatos, Alex ;
Markovitch, Michael ;
Binnig, Carsten ;
Fonseca, Rodrigo ;
Kraska, Tim .
SIGMOD '19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2019, :1189-1206
[17]   Efficient Collective Spatial Keyword Query Processing on Road Networks [J].
Gao, Yunjun ;
Zhao, Jingwen ;
Zheng, Baihua ;
Chen, Gang .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2016, 17 (02) :469-480
[18]   Efficient Reverse Top-k Boolean Spatial Keyword Queries on Road Networks [J].
Gao, Yunjun ;
Qin, Xu ;
Zheng, Baihua ;
Chen, Gang .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (05) :1205-1218
[19]  
Guttman A., 1984, SIGMOD Record, V14, P47, DOI 10.1145/971697.602266
[20]  
Khodaei Ali, 2010, Database and Expert Systems Applications. Proceedings 21st International Conference, DEXA 2010, P450, DOI 10.1007/978-3-642-15364-8_37