MrHeter: improving MapReduce performance in heterogeneous environments

Cited by: 0
Authors
Xiao Zhang
Yanjun Wu
Chen Zhao
Institution
Source
Cluster Computing | 2016 / Vol. 19
Keywords
MapReduce; Heterogeneous cluster; Scheduling; Performance
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
As GPUs, ARM CPUs and even FPGAs become widely used in modern computing, data centers are gradually evolving into heterogeneous clusters. However, many well-known programming models such as MapReduce are designed for homogeneous clusters and perform poorly in heterogeneous environments. In this paper, we reconsider the problem and make four contributions: (1) We analyse the causes of MapReduce's poor performance in heterogeneous clusters; the most important one is unreasonable task allocation between nodes with different computing ability. (2) Based on this, we propose MrHeter, which separates the MapReduce process into a map-shuffle stage and a reduce stage, constructs a separate optimization model for each, and obtains different task allocations $ml_{ij}, mr_{ij}, r_{ij}$ for heterogeneous nodes based on their computing ability. (3) To make it suitable for dynamic execution, we propose D-MrHeter, which adds a monitor and feedback mechanism. (4) Finally, we demonstrate that MrHeter and D-MrHeter can decrease the total execution time of MapReduce by 30 to 70 % in a heterogeneous cluster compared with original Hadoop, with the largest gains under heavy workloads and large differences in computing ability between nodes.
Pages: 1691 - 1701
Page count: 10
Related papers
  • [1] MrHeter: improving MapReduce performance in heterogeneous environments
    Zhang, Xiao
    Wu, Yanjun
    Zhao, Chen
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2016, 19 (04): : 1691 - 1701
  • [2] Improving MapReduce Performance by Data Prefetching in Heterogeneous or Shared Environments
    Gu, Tao
    Zuo, Chuang
    Liao, Qun
    Yang, Yulu
    Li, Tao
    INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2013, 6 (05): : 71 - 81
  • [3] A Usage-Aware Scheduler for Improving MapReduce Performance in Heterogeneous Environments
    Hsiao, J. H.
    Kao, S. J.
    2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE, ELECTRONICS AND ELECTRICAL ENGINEERING (ISEEE), VOLS 1-3, 2014, : 1647 - +
  • [4] Enhancing Performance of MapReduce Framework in Heterogeneous Environments
    Naik, Nenavath Srinivas
    Negi, Atul
    Sastry, V. N.
    2015 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATIONS (ADCOM), 2015, : 51 - 54
  • [5] Load Balancing in Heterogeneous MapReduce Environments
    Fan, Yuanquan
    Wu, Weiguo
    Qian, Depei
    Xu, Yunlong
    Wei, Wei
    2013 IEEE 15TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2013 IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (HPCC_EUC), 2013, : 1480 - 1489
  • [6] Performance Modeling of MapReduce Jobs in Heterogeneous Cloud Environments
    Zhang, Zhuoyao
    Cherkasova, Ludmila
    Loo, Boon Thau
    2013 IEEE SIXTH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD 2013), 2013, : 839 - 846
  • [7] Improving Hadoop MapReduce performance on heterogeneous single board computer clusters
    Lim, Sooyoung
    Park, Dongchul
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 160 : 752 - 766
  • [8] Improving MapReduce heterogeneous performance using KNN fair share scheduling
    Kalia, Khushboo
    Dixit, Saurav
    Kumar, Kaushal
    Gera, Rajat
    Epifantsev, Kirill
    John, Vinod
    Taskaeva, Natalia
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2022, 157
  • [9] TMaR: a two-stage MapReduce scheduler for heterogeneous environments
    Maleki, Neda
    Faragardi, Hamid Reza
    Rahmani, Amir Masoud
    Conti, Mauro
    Lofstead, Jay
    HUMAN-CENTRIC COMPUTING AND INFORMATION SCIENCES, 2020, 10 (01)
  • [10] A Throughput Driven Task Scheduler for Improving MapReduce Performance in Job-intensive Environments
    Wang, Xite
    Shen, Derong
    Yu, Ge
    Nie, Tiezheng
    Kou, Yue
    2013 IEEE INTERNATIONAL CONGRESS ON BIG DATA, 2013, : 211 - 218