Main-Memory Requirements of Big Data Applications on Commodity Server Platform

Cited by: 6
Authors
Makrani, Hosein Mohammadi [1]
Rafatirad, Setareh [1]
Houmansadr, Amir [2]
Homayoun, Houman [1]
Affiliations
[1] George Mason Univ, Dept Elect & Comp Engn, Fairfax, VA 22030 USA
[2] Univ Massachusetts, Coll Informat & Comp Sci, Amherst, MA 01003 USA
Source
2018 18TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID) | 2018
Keywords
big data; memory; Hadoop; Spark; performance
DOI
10.1109/CCGRID.2018.00097
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The emergence of big data frameworks requires computational and memory resources that can naturally scale to manage massive amounts of diverse data. It is currently unclear whether big data frameworks such as Hadoop, Spark, and MPI will require high-bandwidth, large-capacity memory to cope with this change. The primary purpose of this study is to answer this question through empirical analysis of the different memory configurations available for commodity servers, and to assess the impact of these configurations on the performance of the Hadoop and Spark frameworks and of MPI-based applications. Our results show that neither DRAM capacity, frequency, nor the number of channels plays a critical role in the performance of any of the studied Hadoop applications or of most of the studied Spark applications. However, our results reveal that iterative tasks (e.g., machine learning) in Spark and MPI benefit from high-bandwidth, large-capacity memory.
Pages: 653-660
Page count: 8