CSIndex: A Coprocessor-Based Classified Secondary Index Mechanism for Efficient HBase Query

Cited by: 2
Authors
Zou, Zhe [1]
Zheng, Linjiang [1]
Xia, Dong [1]
Chen, Yiwei [1]
Liu, Weining [1]
Chen, Yixiong [1]
Affiliations
[1] Chongqing Univ, Coll Comp Sci, Chongqing, Peoples R China
Source
2019 IEEE INTL CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, BIG DATA & CLOUD COMPUTING, SUSTAINABLE COMPUTING & COMMUNICATIONS, SOCIAL COMPUTING & NETWORKING (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2019) | 2019
Funding
National Key Research and Development Program of China;
Keywords
HBase; Secondary index; Coprocessor; Memory index;
DOI
10.1109/ISPA-BDCloud-SustainCom-SocialCom48970.2019.00131
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In the era of big data, HBase is widely used in many business areas because it stores and manages massive data efficiently. However, native HBase indexes only the rowkey and creates no indexes on non-key columns. Queries on non-key columns therefore require a full table scan, which greatly reduces the efficiency of complex conditional queries. In this paper, we present CSIndex, a coprocessor-based classified secondary index mechanism for HBase. CSIndex proposes an Observer-based secondary index management model to ensure the colocation of relevant data. Furthermore, according to different data characteristics and query requirements, CSIndex designs a classified memory index model to balance query performance and index performance. On this basis, CSIndex proposes an Endpoint-based parallel query algorithm that reduces data transmission overhead and thereby improves query performance. Finally, experiments are conducted on real vehicle trajectory datasets. The results show that CSIndex significantly improves query performance compared with the Solr-based scheme and HiBase, and achieves better overall performance.
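The contrast the abstract draws between a full table scan and a secondary index lookup can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not the CSIndex implementation and not the HBase API: the `ToyTable` class, its methods, and the `plate` column are all hypothetical names chosen for illustration. The `put` method stands in for a write hook that keeps the index in step with the data, which is roughly the role the abstract assigns to the Observer.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for one table (hypothetical; not the HBase API)."""

    def __init__(self, indexed_column):
        self.rows = {}                       # rowkey -> {column: value}
        self.indexed_column = indexed_column
        self.index = defaultdict(set)        # column value -> set of rowkeys

    def put(self, rowkey, columns):
        # Acts like a write hook: update the index whenever a row is written.
        old = self.rows.get(rowkey, {}).get(self.indexed_column)
        if old is not None:
            self.index[old].discard(rowkey)
            if not self.index[old]:
                del self.index[old]          # drop empty index entries
        self.rows[rowkey] = dict(columns)
        new = columns.get(self.indexed_column)
        if new is not None:
            self.index[new].add(rowkey)

    def full_scan(self, value):
        # Without an index, every row has to be inspected.
        return sorted(k for k, cols in self.rows.items()
                      if cols.get(self.indexed_column) == value)

    def index_lookup(self, value):
        # With the index, only the matching rowkeys are touched.
        return sorted(self.index.get(value, ()))


table = ToyTable(indexed_column="plate")
table.put("r1", {"plate": "A123", "speed": 40})
table.put("r2", {"plate": "B456", "speed": 55})
table.put("r3", {"plate": "A123", "speed": 62})
table.put("r2", {"plate": "A123", "speed": 58})  # overwrite moves r2 in the index
```

Both `table.full_scan("A123")` and `table.index_lookup("A123")` return the same rowkeys, but the scan visits every row while the lookup reads a single index entry. The classified memory index and the Endpoint-based parallel query described in the abstract are not modelled here.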
Pages: 897-904
Page count: 8