Main-Memory Requirements of Big Data Applications on Commodity Server Platform

Cited by: 6
Authors
Makrani, Hosein Mohammadi [1]
Rafatirad, Setareh [1]
Houmansadr, Amir [2]
Homayoun, Houman [1]
Affiliations
[1] George Mason Univ, Dept Elect & Comp Engn, Fairfax, VA 22030 USA
[2] Univ Massachusetts, Coll Informat & Comp Sci, Amherst, MA 01003 USA
Source
2018 18TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID) | 2018
Keywords
big data; memory; Hadoop; Spark; performance
DOI
10.1109/CCGRID.2018.00097
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The emergence of big data frameworks requires computational and memory resources that can naturally scale to manage massive amounts of diverse data. It is currently unclear whether big data frameworks such as Hadoop, Spark, and MPI will require high-bandwidth, large-capacity memory to cope with this change. The primary purpose of this study is to answer this question through empirical analysis of the different memory configurations available for commodity servers, and to assess the impact of these configurations on the performance of Hadoop and Spark frameworks and of MPI-based applications. Our results show that neither DRAM capacity, frequency, nor the number of channels plays a critical role in the performance of any of the studied Hadoop applications or of most of the studied Spark applications. However, our results reveal that iterative tasks (e.g., machine learning) in Spark and MPI benefit from high-bandwidth, large-capacity memory.
Pages: 653-660
Page count: 8
Related papers
28 in total
  • [1] Hadoop Characterization
    Alzuru, Icaro
    Long, Kevin
    Li, Tao
    Zimmerman, David
    Gowda, Bhaskar
    [J]. 2015 IEEE TRUSTCOM/BIGDATASE/ISPA, VOL 2, 2015: 96-103
  • [2] BARROSO LA, 1998, ACM SIGARCH COMPUTER, P3
  • [3] Basu Arkaprava, 2013, ACM SIGARCH Comput. Archit. News, DOI 10.1145/2508148.2485943
  • [4] Locality Exists in Graph Processing: Workload Characterization on an Ivy Bridge Server
    Beamer, Scott
    Asanovic, Krste
    Patterson, David
    [J]. 2015 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION (IISWC), 2015: 56-65
  • [5] Big Data - Opportunities and Challenges
    Bertino, Elisa
    [J]. 2013 IEEE 37TH ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), 2013: 479-480
  • [6] The PARSEC Benchmark Suite: Characterization and Architectural Implications
    Bienia, Christian
    Kumar, Sanjeev
    Singh, Jaswinder Pal
    Li, Kai
    [J]. PACT'08: PROCEEDINGS OF THE SEVENTEENTH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, 2008: 72-81
  • [7] Quantifying the Performance Impact of Memory Latency and Bandwidth for Big Data Workloads
    Clapp, Russell
    Dimitrov, Martin
    Kumar, Karthik
    Viswanathan, Vish
    Willhalm, Thomas
    [J]. 2015 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION (IISWC), 2015: 213-224
  • [8] Dimitrov Martin, 2013, 2013 IEEE International Conference on Big Data, P15, DOI 10.1109/BigData.2013.6691693
  • [9] Fengfeng Pan, 2014, Big Data Benchmarks, Performance Optimization, and Emerging Hardware. 4th and 5th Workshops, BPOE 2014. Revised Selected Papers, P85, DOI 10.1007/978-3-319-13021-7_7
  • [10] Clearing the Clouds: A Study of Emerging Scale-out Workloads on Modern Hardware
    Ferdman, Michael
    Adileh, Almutaz
    Kocberber, Onur
    Volos, Stavros
    Alisafaee, Mohammad
    Jevdjic, Djordje
    Kaynak, Cansu
    Popescu, Adrian Daniel
    Ailamaki, Anastasia
    Falsafi, Babak
    [J]. ACM SIGPLAN NOTICES, 2012, 47 (04): 37-47