PISCES: Optimizing Multi-Job Application Execution in MapReduce

Cited by: 4
Authors
Chen, Qi [1]
Yao, Jinyu [1]
Li, Benchao [1]
Xiao, Zhen [1]
Affiliations
[1] Peking Univ, Dept Comp Sci, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
MapReduce; job dependency; group scheduling; pipeline; optimization
DOI
10.1109/TCC.2016.2603509
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Many MapReduce applications, such as iterative machine learning applications and large database queries, consist of groups of jobs with dependencies among them. Unfortunately, the MapReduce framework is not optimized for these multi-job applications: it schedules jobs independently and does not exploit opportunities to overlap the execution of dependent jobs. These limitations significantly inflate application execution time. This paper presents Pipeline Improvement Support with Critical chain Estimation Scheduling (PISCES), a critical chain optimization that provides better support for multi-job applications. (A critical chain is a series of jobs in which a delay to any one job makes the whole application run longer.) PISCES extends the existing MapReduce framework to schedule multiple dependent jobs by dynamically building a job dependency DAG for the currently running jobs from their input and output directories. Using this DAG, it provides a mechanism for pipelining data between the output phase of an upstream job (the map phase in a Map-Only job, or the reduce phase in a Map-Reduce job) and the map phase of a downstream job. This introduces a new form of execution overlap between dependent jobs in MapReduce, which effectively reduces application runtime. Moreover, PISCES proposes a novel critical chain job scheduling model based on accurate critical chain estimation. Experiments show that PISCES can increase the degree of system parallelism by up to 68 percent and improve the execution speed of applications by up to 52 percent.
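The two ideas named in the abstract, inferring job dependencies from input and output directories and identifying a critical chain, can be illustrated with a minimal Python sketch. This is not the PISCES implementation. The function names (`build_dag`, `critical_chain`), the job names, the directory paths, and the per-job time estimates are all hypothetical. The sketch also treats the critical chain simply as the longest path through the DAG under fixed estimates, which is an assumption; the abstract does not describe how PISCES performs its estimation.

```python
from collections import defaultdict


def build_dag(jobs):
    """Infer dependencies from directories (illustrative only).

    jobs maps a job name to (input_dirs, output_dir). Job B depends on
    job A when one of B's input directories is A's output directory.
    Returns a mapping: upstream job name -> set of downstream job names.
    """
    producers = {out: name for name, (_, out) in jobs.items()}
    downstream = defaultdict(set)
    for name, (inputs, _) in jobs.items():
        for directory in inputs:
            upstream = producers.get(directory)
            if upstream is not None and upstream != name:
                downstream[upstream].add(name)
    return downstream


def critical_chain(jobs, est_time):
    """Return (total_time, chain) for the longest path in the job DAG.

    est_time maps each job name to an assumed runtime estimate.
    Assumes the dependency graph is acyclic.
    """
    downstream = build_dag(jobs)
    memo = {}

    def longest_from(name):
        # Longest chain that starts at `name`, memoized per job.
        if name not in memo:
            best_len, best_path = 0, []
            for nxt in downstream.get(name, ()):
                length, path = longest_from(nxt)
                if length > best_len:
                    best_len, best_path = length, path
            memo[name] = (est_time[name] + best_len, [name] + best_path)
        return memo[name]

    return max((longest_from(n) for n in jobs), key=lambda t: t[0])


# Hypothetical four-job application: extract feeds clean and stats,
# and join consumes the outputs of both.
jobs = {
    "extract": (["/in"], "/tmp/a"),
    "clean": (["/tmp/a"], "/tmp/b"),
    "stats": (["/tmp/a"], "/tmp/c"),
    "join": (["/tmp/b", "/tmp/c"], "/out"),
}
est_time = {"extract": 10, "clean": 30, "stats": 5, "join": 20}

total, chain = critical_chain(jobs, est_time)
print(total, chain)  # 60 ['extract', 'clean', 'join']
```

In this toy example, delaying `stats` by up to 25 time units does not lengthen the application, while any delay to `extract`, `clean`, or `join` does. That is the property the abstract uses to define a critical chain.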
Pages: 273-286
Number of pages: 14