CSIndex: A Coprocessor-Based Classified Secondary Index Mechanism for Efficient HBase Query

Cited by: 2
Authors
Zou, Zhe [1]
Zheng, Linjiang [1]
Xia, Dong [1]
Chen, Yiwei [1]
Liu, Weining [1]
Chen, Yixiong [1]
Affiliations
[1] Chongqing University, College of Computer Science, Chongqing, People's Republic of China
Source
2019 IEEE INTL CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, BIG DATA & CLOUD COMPUTING, SUSTAINABLE COMPUTING & COMMUNICATIONS, SOCIAL COMPUTING & NETWORKING (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2019) | 2019
Funding
National Key Research and Development Program of China
Keywords
HBase; Secondary index; Coprocessor; Memory index
DOI
10.1109/ISPA-BDCloud-SustainCom-SocialCom48970.2019.00131
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the era of big data, HBase is widely used in many business areas because it stores and manages massive data efficiently. However, native HBase indexes only the rowkey and creates no index on non-key columns. Querying data by a non-key column therefore requires a full table scan, which greatly reduces the efficiency of complex conditional queries. In this paper, we present CSIndex, a coprocessor-based classified secondary index mechanism for HBase. CSIndex proposes an Observer-based secondary index management model to ensure the colocation of relevant data. Furthermore, according to different data characteristics and query requirements, CSIndex designs a classified memory index model to balance query performance and index performance. On this basis, CSIndex proposes an Endpoint-based parallel query algorithm that reduces data transmission overhead and thereby improves query performance. Finally, experiments are conducted on real vehicle trajectory datasets. The results show that CSIndex significantly improves query performance compared with the Solr-based scheme and HiBase, and achieves better overall performance.
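The contrast the abstract draws between a full table scan and a secondary index lookup can be illustrated with a minimal sketch. This is a toy in-memory model, not CSIndex itself and not the HBase API: the class `IndexedTable`, its methods, and the column name `plate` are all hypothetical names chosen for illustration. The index update inside `put` only loosely mirrors the idea of maintaining index entries alongside writes; the paper's actual Observer and Endpoint designs are not reproduced here.

```python
from collections import defaultdict


class IndexedTable:
    """Toy stand-in for one table region (hypothetical, not the HBase API)."""

    def __init__(self, indexed_columns):
        self.rows = {}  # rowkey -> {column: value}
        self.indexed_columns = set(indexed_columns)
        # column -> value -> set of rowkeys holding that value
        self.index = {c: defaultdict(set) for c in self.indexed_columns}

    def put(self, rowkey, values):
        """Write columns for a row and keep the secondary index in step."""
        old = self.rows.get(rowkey, {})
        for col in self.indexed_columns & values.keys():
            if col in old:
                # Drop the stale index entry before adding the new one.
                self.index[col][old[col]].discard(rowkey)
            self.index[col][values[col]].add(rowkey)
        self.rows[rowkey] = {**old, **values}

    def full_scan(self, column, value):
        """Non-key query without an index: inspect every row."""
        return sorted(k for k, r in self.rows.items() if r.get(column) == value)

    def query_indexed(self, column, value):
        """Non-key query with an index: look up matching rowkeys directly."""
        return sorted(self.index[column].get(value, ()))


table = IndexedTable(indexed_columns=["plate"])
table.put("r1", {"plate": "A123", "speed": 40})
table.put("r2", {"plate": "B456", "speed": 55})
table.put("r3", {"plate": "A123", "speed": 62})
table.put("r2", {"plate": "A123"})  # update moves r2 to a new index entry

matches = table.query_indexed("plate", "A123")
```

Both query methods return the same rowkeys; the difference is that `full_scan` touches every row while `query_indexed` touches only the index entry for the requested value.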
Pages: 897-904
Page count: 8