RSTORE: A Distributed Multi-version Document Store

Cited by: 5
Authors
Bhattacherjee, Souvik [1]
Deshpande, Amol [1]
Affiliations
[1] Univ Maryland, College Pk, MD 20742 USA
Source
2018 IEEE 34TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE) | 2018
DOI
10.1109/ICDE.2018.00043
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
We address the problem of compactly storing a large number of versions (snapshots) of a collection of keyed documents or records in a distributed environment, while efficiently answering a variety of retrieval queries over them, including retrieving full or partial versions, and evolution histories for specific keys. We motivate the increasing need for such a system in a variety of application domains, carefully explore the design space for building such a system and the various storage-computation-retrieval trade-offs, and discuss how different storage layouts influence those trade-offs. We propose a novel system architecture that satisfies the key desiderata for such a system, and offers simple tuning knobs that allow adapting to a specific data and query workload. Our system is intended to act as a layer on top of a distributed key-value store that houses the raw data as well as any indexes. We design novel off-line storage layout algorithms for efficiently partitioning the data to minimize the storage costs while keeping the retrieval costs low. We also present an online algorithm to handle new versions being added to the system. Using extensive experiments on large datasets, we demonstrate that our system operates at the scale required in most practical scenarios and often outperforms standard baselines, including a delta-based storage engine, by orders of magnitude.
Pages: 389-400
Page count: 12