MrHeter: improving MapReduce performance in heterogeneous environments

Cited by: 0
Authors
Xiao Zhang
Yanjun Wu
Chen Zhao
Affiliation
Source
Cluster Computing | 2016 / Vol. 19
Keywords
MapReduce; Heterogeneous cluster; Scheduling; Performance
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
As GPUs, ARM CPUs and even FPGAs become widely used in modern computing, data centers are gradually evolving into heterogeneous clusters. However, many well-known programming models such as MapReduce are designed for homogeneous clusters and perform poorly in heterogeneous environments. In this paper, we reconsider the problem and make four contributions: (1) We analyse the causes of MapReduce's poor performance in heterogeneous clusters; the most important one is unreasonable task allocation between nodes with different computing ability. (2) Based on this, we propose MrHeter, which separates the MapReduce process into a map-shuffle stage and a reduce stage, constructs a separate optimization model for each, and derives different task allocations $ml_{ij}, mr_{ij}, r_{ij}$ for heterogeneous nodes based on their computing ability. (3) To make it suitable for dynamic execution, we propose D-MrHeter, which adds a monitoring and feedback mechanism. (4) Finally, we demonstrate that MrHeter and D-MrHeter reduce the total execution time of MapReduce by 30 to 70% in a heterogeneous cluster compared with original Hadoop, with the largest gains under heavy workloads and large differences in computing ability between nodes.
Pages: 1691-1701
Page count: 10
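The abstract's central idea, allocating tasks to nodes according to their computing ability, can be illustrated with a minimal sketch. This is not MrHeter's optimization model, which the abstract does not specify; it only splits a fixed number of tasks in proportion to an assumed per-node capacity score. The function name `allocate_tasks`, the node names, and the capacity values are all hypothetical.

```python
from typing import Dict


def allocate_tasks(total_tasks: int, capacity: Dict[str, float]) -> Dict[str, int]:
    """Split total_tasks across nodes in proportion to their capacity score.

    Illustrative only: a proportional split with largest-remainder rounding,
    not the optimization model proposed in the paper.
    """
    if total_tasks < 0:
        raise ValueError("total_tasks must be non-negative")
    if not capacity or any(c <= 0 for c in capacity.values()):
        raise ValueError("every node needs a positive capacity score")

    total_capacity = sum(capacity.values())

    # Ideal (fractional) share of tasks for each node.
    ideal = {node: total_tasks * c / total_capacity for node, c in capacity.items()}

    # Start from the integer part of each share.
    alloc = {node: int(share) for node, share in ideal.items()}

    # Hand the leftover tasks to the nodes with the largest fractional remainder.
    leftover = total_tasks - sum(alloc.values())
    by_remainder = sorted(capacity, key=lambda n: ideal[n] - alloc[n], reverse=True)
    for node in by_remainder[:leftover]:
        alloc[node] += 1

    return alloc


if __name__ == "__main__":
    # Hypothetical cluster: capacity scores are assumed, e.g. from a benchmark.
    print(allocate_tasks(12, {"fast": 3.0, "medium": 2.0, "slow": 1.0}))
```

A uniform split would give each of the three nodes four tasks, so the slowest node would determine the finish time; weighting by capacity is the simplest way to avoid that, although a real scheduler would also have to account for data locality and shuffle cost.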