SplitDB: Closing the Performance Gap for LSM-Tree-Based Key-Value Stores

Times cited: 6
Authors
Cai, Miao [1, 2]
Jiang, Xuzhen [2]
Shen, Junru [2]
Ye, Baoliu [1, 2, 3]
Affiliations
[1] Hohai Univ, Key Lab Water Big Data Technol, Minist Water Resources, Nanjing 211100, Peoples R China
[2] Hohai Univ, Sch Comp & Informat, Nanjing 211100, Peoples R China
[3] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Log-structured merge tree; non-volatile memory; key-value storage;
DOI
10.1109/TC.2023.3326982
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The Log-Structured Merge tree (LSM tree) serves as the core storage engine in modern key-value stores, and its adoption has accelerated rapidly with the growth of cloud computing and data centers. Despite its widespread use, the LSM tree still suffers from severe performance issues such as write stalls, write amplification, and read inefficiency. This article presents research on improving the performance of LSM-tree-based key-value stores using emerging Non-Volatile Memory (NVM) technology. Our performance diagnosis reveals that these issues result primarily from intensive processing of hot key-value data, compounded by slow storage devices. To address the hotspot bottlenecks, we propose a split log-structured merge tree over hybrid storage that leverages the intrinsic hot and cold data separation property of the LSM tree. Our approach promotes the frequently accessed, small high levels onto fast NVM and offloads the remaining cold, large low levels to slow devices, effectively closing the performance gap for DRAM-disk-based LSM trees. Additionally, we optimize the read and write performance of the split LSM tree with a variety of novel techniques. We build a hotspot-aware key-value database named SplitDB and perform extensive experiments. Experimental results demonstrate that SplitDB effectively prevents write stalls, achieves a 6-fold write reduction, and improves read throughput by 3.5 times compared to state-of-the-art key-value databases.
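The level split described in the abstract can be illustrated with a toy sketch. This is not SplitDB's actual placement policy, which the abstract does not specify. It assumes level capacities grow geometrically by a fixed size ratio and that the smallest (upper) levels are placed on NVM until a hypothetical capacity budget is used up. The function names `level_capacities` and `split_levels`, and all parameters, are invented for illustration.

```python
def level_capacities(base_bytes, size_ratio, num_levels):
    """Capacity of each level, growing geometrically from the top level.

    Assumption: level i holds base_bytes * size_ratio**i bytes.
    """
    return [base_bytes * size_ratio ** i for i in range(num_levels)]


def split_levels(capacities, nvm_budget_bytes):
    """Assign each level to "nvm" or "disk" (hypothetical policy).

    Upper levels go to NVM while they fit in the budget; once one
    level does not fit, it and every level below it go to disk.
    """
    placement = []
    used = 0
    on_nvm = True
    for cap in capacities:
        if on_nvm and used + cap <= nvm_budget_bytes:
            used += cap
            placement.append("nvm")
        else:
            on_nvm = False
            placement.append("disk")
    return placement


caps = level_capacities(base_bytes=64, size_ratio=10, num_levels=5)
plan = split_levels(caps, nvm_budget_bytes=8000)
```

With these toy numbers the three upper levels total 7104 bytes and fit within the 8000-byte budget, so they land on NVM, while the two larger levels below them go to disk.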
Pages: 206-220
Page count: 15