Multi-job Merging Framework and Scheduling Optimization for Apache Flink

Cited by: 0
Authors
Ji, Hangxu [1]
Wu, Gang [1]
Zhao, Yuhai [1]
Yuan, Ye [2]
Wang, Guoren [2]
Affiliations
[1] Northeastern University, School of Computer Science and Engineering, Shenyang, People's Republic of China
[2] Beijing Institute of Technology, School of Computer Science and Technology, Beijing, People's Republic of China
Funding
National Key Research and Development Program of China;
Keywords
Multi-job merging; Scheduling optimization; Distributed computing; Flink; SPARK;
DOI
10.1007/978-3-030-73194-6_2
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the popularization of big data technology, distributed computing systems are constantly evolving and maturing, making substantial contributions to the query and analysis of massive data. However, insufficient utilization of system resources is an inherent problem of distributed computing engines. In particular, when a growing number of jobs causes execution to block, the system schedules the jobs on a first-come-first-executed (FCFE) basis, even if many resources in the cluster remain unused. Optimizing resource utilization is therefore key to improving the efficiency of multi-job execution. We investigated the field of multi-job execution optimization, designed a multi-job merging framework and a scheduling optimization algorithm, and implemented them in Apache Flink, a latest-generation distributed computing system. The main advantages of our work are as follows: (1) the framework enables Flink to support multi-job collection, merging, and dynamic tuning of the execution sequence, and the selection of these functions is customizable; (2) with multi-job merging and optimization, the total running time can be reduced by 31% compared with traditional sequential execution; (3) the multi-job scheduling optimization algorithm can bring a 28% performance improvement and, in the average case, can reduce idle cluster resources by 61%.
Pages: 20-36
Number of pages: 17