An Efficiency-Aware Scheduling for Data-Intensive Computations on MapReduce Clusters

Cited by: 0
Authors
Zhao, Hui [1]
Yang, Shuqiang [2]
Fan, Hua [1]
Chen, Zhikun [1]
Xu, Jinghu [1]
Affiliations
[1] National University of Defense Technology, School of Computer, Changsha, Hunan, People's Republic of China
[2] National University of Defense Technology, Changsha, Hunan, People's Republic of China
Source
IEICE Transactions on Information and Systems | 2013 / Vol. E96-D / No. 12
Funding
National Natural Science Foundation of China
Keywords
data-intensive computation; MapReduce; Hadoop; algorithm design; scheduling; grid computing; data locality; cloud computing; flowtime
DOI
10.1587/transinf.E96.D.2654
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Scheduling plays a key role in MapReduce systems. In this paper, we explore the efficiency of a MapReduce cluster running many independent, continuously arriving MapReduce jobs. Data locality and load balancing are two important factors in improving the computation efficiency of MapReduce systems for data-intensive computations. Traditional cluster scheduling techniques are not well suited to the MapReduce environment. Several schedulers are in use for the popular open-source Hadoop MapReduce implementation; however, they cannot optimize both factors well. Our main objective is to minimize the total flowtime of all jobs. Because this is a strongly NP-hard problem, we adopt effective heuristics to find satisfactory solutions. We formalize the scheduling problem as a job selection problem and propose a load-balance-aware job selection algorithm. At the task level, we design a strict data-locality scheduling algorithm for map tasks on map machines and a load-balance-aware scheduling algorithm for reduce tasks on reduce machines. Comprehensive experiments have been conducted to compare our scheduling strategy with well-known Hadoop scheduling strategies. The experimental results validate the efficiency of our proposed scheduling strategy.
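The two task-level ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It only assumes that "strict data locality" means a map task runs solely on a node holding a replica of its input block, and that "load balance aware" means reduce tasks go to the currently least-loaded node. All function names, parameters, and the cost model are hypothetical.

```python
def assign_map_tasks(task_replicas, free_slots):
    """Strict-locality placement (illustrative assumption, not the paper's method).

    task_replicas: dict mapping map task -> list of nodes holding its input block
    free_slots:    dict mapping node -> number of free map slots
    Returns (assignment, deferred): tasks with no free replica node are deferred
    rather than run on a remote node.
    """
    slots = dict(free_slots)  # work on a copy so the caller's dict is unchanged
    assignment, deferred = {}, []
    for task, replicas in task_replicas.items():
        candidates = [n for n in replicas if slots.get(n, 0) > 0]
        if not candidates:
            deferred.append(task)  # wait for a local slot instead of going remote
            continue
        # Among local candidates, prefer the node with the most free slots.
        node = max(candidates, key=lambda n: slots[n])
        assignment[task] = node
        slots[node] -= 1
    return assignment, deferred


def assign_reduce_tasks(task_costs, nodes):
    """Greedy load balancing (hypothetical cost model): largest task first,
    each placed on the node with the smallest accumulated load."""
    load = {n: 0.0 for n in nodes}
    assignment = {}
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        node = min(load, key=load.get)
        assignment[task] = node
        load[node] += cost
    return assignment, load
```

In this sketch, deferring a map task trades waiting time for locality, while the reduce placement ignores locality entirely. How the paper balances these against total flowtime is not stated in the abstract.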
Pages: 2654-2662
Page count: 9