Exploring Plan-Based Scheduling for Large-Scale Computing Systems

Cited by: 9
Authors
Zheng, Xingwu [1 ]
Zhou, Zhou [2 ]
Yang, Xu [2 ]
Lan, Zhiling [2 ]
Wang, Jia [1 ]
Affiliations
[1] IIT, Dept Elect & Comp Engn, Chicago, IL 60616 USA
[2] IIT, Dept Comp Sci, Chicago, IL 60616 USA
Source
2016 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER) | 2016
Keywords
Plan-based scheduling; Simulated Annealing algorithm; Optimization;
DOI
10.1109/CLUSTER.2016.43
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As HPC systems scale toward exascale, it becomes critical to manage the underlying resources more effectively. Almost all existing resource management systems schedule jobs in a queuing fashion and make isolated scheduling decisions that can compromise system performance even with backfilling. Plan-based schedulers have the potential to generate better job schedules by producing an execution plan for all waiting jobs, but they have not received enough attention. In this paper, we present a novel plan-based scheduling system that uses simulated annealing as the optimization engine to support effective resource management on HPC systems. As demonstrated by extensive trace-based simulations with workload traces collected from a wide range of production supercomputers, in comparison with a queue-based scheduling system using FCFS with EASY backfilling, our plan-based scheduling system can reduce job wait time by 40% and job response time by 30% while slightly improving system utilization. Moreover, our plan-based system is able to run online, solving the scheduling problem at each scheduling iteration within one second, which makes it practical for production HPC systems.
Pages: 259-268
Number of pages: 10