Performance Modeling of MapReduce Jobs in Heterogeneous Cloud Environments

Cited by: 45
Authors
Zhang, Zhuoyao [1]
Cherkasova, Ludmila [2]
Loo, Boon Thau [1]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
[2] Hewlett Packard Labs, Palo Alto, CA USA
Source
2013 IEEE SIXTH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD 2013) | 2013
Keywords
MapReduce; heterogeneous clusters; performance modeling; efficiency
DOI
10.1109/CLOUD.2013.107
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Many companies have started using Hadoop for advanced data analytics over large datasets. While a traditional Hadoop deployment assumes a homogeneous cluster, many enterprise clusters grow incrementally over time and may contain a variety of different servers. This node heterogeneity poses an additional challenge for efficient cluster and job management. Because of resource heterogeneity, it is often unclear which resources introduce inefficiency and bottlenecks, and how such a Hadoop cluster should be configured and optimized. In this work, we explore the efficiency and accuracy of the bounds-based performance model for predicting MapReduce job completion times in heterogeneous Hadoop clusters. We validate the accuracy of the proposed performance model using a diverse set of 13 realistic applications and two different heterogeneous clusters. Since one of the Hadoop clusters is formed by VM instances of different capacities in the Amazon EC2 environment, we additionally explore and discuss factors that impact MapReduce job performance in the Cloud.
Pages: 839-846
Page count: 8