Learned index for non-key queries

被引:0
作者
Zhu, Rui [1 ]
Wang, Hongzhi [1 ]
Xia, Sheng [1 ]
Zheng, Bo [2 ]
机构
[1] Harbin Inst Technol, Comp Sci & Technol, 92 Xidazhi St, Harbin 150000, Heilongjiang, Peoples R China
[2] CnosDB, Beijing 100000, Peoples R China
基金
中国国家自然科学基金;
关键词
Bloom filter; Learned index; Non-key query; Index;
D O I
10.1007/s10115-024-02233-0
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Learned indexes have attracted a lot of interest lately due to their superior performance over conventional indexes. When there is a lot of data traffic, the learned index efficiently addresses the issue of the standard index's large memory usage. In this paper, we concentrate on a well-known learned index, the recursive model index (RMI). Since the machine learning model is unbiased while calculating, when there are too many non-key queried, the model will calculate the position of the key as if it were positive key, which wastes a lot of time on unnecessary calculations. To deal with this condition, we propose a hierarchical learned index structure based on Bloom filter named HBFdex. HBFdex can effectively prune non-keys, which means most non-key return in layer of BF before they get to machine learning model. By lowering the number of layers traversed by non-key and the time spent looking for non-key within the error bound that is provided by machine learning model, HBFdex decreases the average query time of learned index. We compare HBFdex with B-Tree and RMI, and the results prove that our new structure optimizes the performance of RMI in the case of non-key queries.
引用
收藏
页码:497 / 519
页数:23
相关论文
共 32 条
  • [1] Abu-Libdeh H, 2020, ARXIV
  • [2] [Anonymous], 2010, P 2010 ACM SIGMOD IN, DOI DOI 10.1145/1807167.1807206
  • [3] Bayer R., 1971, Organization and maintenance of large ordered indices
  • [4] Adaptive Learned Bloom Filters under Incremental Workloads
    Bhattacharya, Arindam
    Bedathur, Srikanta
    Bagchi, Amitabha
    [J]. PROCEEDINGS OF THE 7TH ACM IKDD CODS AND 25TH COMAD (CODS-COMAD 2020), 2020, : 107 - 115
  • [5] SPACE/TIME TRADE/OFFS IN HASH CODING WITH ALLOWABLE ERRORS
    BLOOM, BH
    [J]. COMMUNICATIONS OF THE ACM, 1970, 13 (07) : 422 - &
  • [6] Boehm M., 2011, Datenbanksysteme fur Business, Technologie und Web (BTW)
  • [7] Boffa A, 2021, Algorithm Eng Exp, P46
  • [8] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [9] Learned FBF: Learning-Based Functional Bloom Filter for Key-Value Storage
    Byun, Hayoung
    Lim, Hyesook
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2022, 71 (08) : 1928 - 1938
  • [10] UBIQUITOUS B-TREE
    COMER, D
    [J]. COMPUTING SURVEYS, 1979, 11 (02) : 121 - 137