Learned index for non-key queries

被引：0

作者：

Zhu, Rui ^{[1
]}

Wang, Hongzhi ^{[1
]}

Xia, Sheng ^{[1
]}

Zheng, Bo ^{[2
]}

机构：

[1] Harbin Inst Technol, Comp Sci & Technol, 92 Xidazhi St, Harbin 150000, Heilongjiang, Peoples R China

[2] CnosDB, Beijing 100000, Peoples R China

来源：

KNOWLEDGE AND INFORMATION SYSTEMS | 2025年 / 67卷 / 01期

基金：

中国国家自然科学基金;

关键词：

Bloom filter; Learned index; Non-key query; Index;

D O I：

10.1007/s10115-024-02233-0

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

Learned indexes have attracted a lot of interest lately due to their superior performance over conventional indexes. When there is a lot of data traffic, the learned index efficiently addresses the issue of the standard index's large memory usage. In this paper, we concentrate on a well-known learned index, the recursive model index (RMI). Since the machine learning model is unbiased while calculating, when there are too many non-key queried, the model will calculate the position of the key as if it were positive key, which wastes a lot of time on unnecessary calculations. To deal with this condition, we propose a hierarchical learned index structure based on Bloom filter named HBFdex. HBFdex can effectively prune non-keys, which means most non-key return in layer of BF before they get to machine learning model. By lowering the number of layers traversed by non-key and the time spent looking for non-key within the error bound that is provided by machine learning model, HBFdex decreases the average query time of learned index. We compare HBFdex with B-Tree and RMI, and the results prove that our new structure optimizes the performance of RMI in the case of non-key queries.

引用

页码：497 / 519

页数：23

共 32 条

[1] Abu-Libdeh H, 2020, ARXIV
[2] [Anonymous], 2010, P 2010 ACM SIGMOD IN, DOI DOI 10.1145/1807167.1807206
[3] Bayer R., 1971, Organization and maintenance of large ordered indices
[4] Adaptive Learned Bloom Filters under Incremental Workloads
Bhattacharya, Arindam
Bedathur, Srikanta
Bagchi, Amitabha
[J]. PROCEEDINGS OF THE 7TH ACM IKDD CODS AND 25TH COMAD (CODS-COMAD 2020), 2020, : 107 - 115
[5] SPACE/TIME TRADE/OFFS IN HASH CODING WITH ALLOWABLE ERRORS
BLOOM, BH
[J]. COMMUNICATIONS OF THE ACM, 1970, 13 (07) : 422 - &
[6] Boehm M., 2011, Datenbanksysteme fur Business, Technologie und Web (BTW)
[7] Boffa A, 2021, Algorithm Eng Exp, P46
[8] Random forests
Breiman, L
[J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
[9] Learned FBF: Learning-Based Functional Bloom Filter for Key-Value Storage
Byun, Hayoung
Lim, Hyesook
[J]. IEEE TRANSACTIONS ON COMPUTERS, 2022, 71 (08) : 1928 - 1938
[10] UBIQUITOUS B-TREE
COMER, D
[J]. COMPUTING SURVEYS, 1979, 11 (02) : 121 - 137

← 1 2 3 4 →