On big data benchmarking

Cited by: 13

Authors:

Han, Rui [1]

Lu, Xiaoyi [2]

Xu, Jiangtao [3]

Affiliations:

[1] Department of Computing, Imperial College London, London

[2] Ohio State University, Columbus

[3] Beijing Jiaotong University, Beijing

Source:

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 2014 / Vol. 8807

Keywords:

Benchmark; Big data systems; Data; Tests

DOI:

10.1007/978-3-319-13021-7_1

Abstract:

Big data systems address the challenges of capturing, storing, managing, analyzing, and visualizing big data. Within this context, developing benchmarks to evaluate and compare big data systems has become an active topic for both the research and industry communities. To date, most state-of-the-art big data benchmarks are designed for specific types of systems. Based on our experience, however, we argue that, given the complexity, diversity, and rapid evolution of big data systems, big data benchmarks must include a diversity of data and workloads for the sake of fairness. Given this motivation, in this paper we first propose the key requirements and challenges in developing big data benchmarks from two perspectives: generating data with the 4V properties of big data (i.e., volume, velocity, variety, and veracity), and generating tests with comprehensive workloads for big data systems. We then present a methodology for big data benchmarking designed to address these challenges. Next, state-of-the-art benchmarks are summarized and compared, followed by our vision for future research directions. © Springer International Publishing Switzerland 2014.

Pages: 3-18

Page count: 15

