A distributed in-memory key-value store system on heterogeneous CPU–GPU cluster

Cited by: 0
Authors
Kai Zhang
Kaibo Wang
Yuan Yuan
Lei Guo
Rubao Li
Xiaodong Zhang
Bingsheng He
Jiayu Hu
Bei Hua
Affiliations
[1] Fudan University
[2] Google Inc.
[3] The Ohio State University
[4] National University of Singapore
[5] University of Science and Technology of China
Source
The VLDB Journal | 2017 / Vol. 26
Keywords
Key-value store; GPU; Heterogeneous systems; Distributed systems; Energy efficiency
DOI
Not available
Abstract
In-memory key-value stores play a critical role in many data-intensive applications by providing high-throughput, low-latency data access. They have several distinctive properties: (1) data-intensive operations that demand high memory bandwidth for fast data access, (2) high data parallelism and simple computing operations that demand many slim parallel computing units, and (3) a large working set. However, our experiments show that homogeneous multicore CPU systems are increasingly mismatched to these properties: they do not provide massive data parallelism or high memory bandwidth; their powerful but limited number of computing cores does not meet the demands of this data processing task; and the cache hierarchy may offer little benefit for the large working set. In this paper, we present the design and implementation of Mega-KV, a distributed in-memory key-value store system on a heterogeneous CPU–GPU cluster. By effectively utilizing the high memory bandwidth and latency-hiding capability of GPUs, Mega-KV provides fast data access and significantly improves overall performance and energy efficiency over homogeneous CPU architectures. Mega-KV shows excellent scalability and processes up to 623 million key-value operations per second on a cluster with eight CPUs and eight GPUs, while delivering an energy efficiency of up to 299 thousand operations per second per watt (KOPS/W).
Pages: 729-750
Page count: 21