Performance Modeling of MapReduce Jobs in Heterogeneous Cloud Environments

被引:45
|
作者
Zhang, Zhuoyao [1 ]
Cherkasova, Ludmila [2 ]
Boon Thau Loo [1 ]
机构
[1] Univ Penn, Philadelphia, PA 19104 USA
[2] Hewlett Packard Labs, Palo Alto, CA USA
来源
2013 IEEE SIXTH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD 2013) | 2013年
关键词
MapReduce; heterogeneous clusters; performance modeling; efficiency;
D O I
10.1109/CLOUD.2013.107
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Many companies start using Hadoop for advanced data analytics over large datasets. While a traditional Hadoop cluster deployment assumes a homogeneous cluster, many enterprise clusters are grown incrementally over time, and might have a variety of different servers in the cluster. The nodes' heterogeneity represents an additional challenge for efficient cluster and job management. Due to resource heterogeneity, it is often unclear which resources introduce inefficiency and bottlenecks, and how such a Hadoop cluster should be configured and optimized. In this work(1), we explore the efficiency and performance accuracy of the bounds-based performance model for predicting the MapReduce job completion times in heterogeneous Hadoop clusters. We validate the accuracy of the proposed performance model using a diverse set of 13 realistic applications and two different heterogeneous clusters. Since one of the Hadoop clusters is formed by different capacity VM instances in Amazon EC2 environment, we additionally explore and discuss factors that impact the MapReduce job performance in the Cloud.
引用
收藏
页码:839 / 846
页数:8
相关论文
共 50 条
  • [1] MrHeter: improving MapReduce performance in heterogeneous environments
    Zhang, Xiao
    Wu, Yanjun
    Zhao, Chen
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2016, 19 (04): : 1691 - 1701
  • [2] MrHeter: improving MapReduce performance in heterogeneous environments
    Xiao Zhang
    Yanjun Wu
    Chen Zhao
    Cluster Computing, 2016, 19 : 1691 - 1701
  • [3] Enhancing Performance of MapReduce Framework in Heterogeneous Environments
    Naik, Nenavath Srinivas
    Negi, Atul
    Sastry, V. N.
    2015 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATIONS (ADCOM), 2015, : 51 - 54
  • [4] A Combined Analytical Modeling Machine Learning Approach for Performance Prediction of MapReduce Jobs in Cloud Environment
    Ataie, Ehsan
    Gianniti, Eugenio
    Ardagna, Danilo
    Movaghar, Ali
    PROCEEDINGS OF 2016 18TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC), 2016, : 431 - 439
  • [5] MapReduce++ - Efficient processing of MapReduce jobs in the cloud
    Zhang, Guigang
    Li, Chao
    Zhang, Yong
    Xing, Chunxiao
    Yang, Jijiang
    Journal of Computational Information Systems, 2012, 8 (14): : 5757 - 5764
  • [6] Achieving Elasticity for Cloud MapReduce Jobs
    Salah, Khaled
    Calero, Jose M. Alcaraz
    PROCEEDINGS OF THE 2013 IEEE 2ND INTERNATIONAL CONFERENCE ON CLOUD NETWORKING (CLOUDNET), 2013, : 195 - 199
  • [7] Improving MapReduce Performance in Heterogeneous Environments with Adaptive Task Tuning
    Cheng, Dazhao
    Rao, Jia
    Guo, Yanfei
    Zhou, Xiaobo
    ACM/IFIP/USENIX MIDDLEWARE 2014, 2014, : 97 - 108
  • [8] Improving MapReduce Performance by Data Prefetching in Heterogeneous or Shared Environments
    Gu, Tao
    Zuo, Chuang
    Liao, Qun
    Yang, Yulu
    Li, Tao
    INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2013, 6 (05): : 71 - 81
  • [9] A data locality based scheduler to enhance MapReduce performance in heterogeneous environments
    Naik, Nenavath Srinivas
    Negi, Atul
    Bapu, Tapas B. R.
    Anitha, R.
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 90 : 423 - 434
  • [10] A Usage-Aware Scheduler for Improving MapReduce Performance in Heterogeneous Environments
    Hsiao, J. H.
    Kao, S. J.
    2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE, ELECTRONICS AND ELECTRICAL ENGINEERING (ISEEE), VOLS 1-3, 2014, : 1647 - +