Exploring Plan-Based Scheduling for Large-Scale Computing Systems

Cited by: 9
Authors
Zheng, Xingwu [1]
Zhou, Zhou [2]
Yang, Xu [2]
Lan, Zhiling [2]
Wang, Jia [1]
Affiliations
[1] IIT, Dept Elect & Comp Engn, Chicago, IL 60616 USA
[2] IIT, Dept Comp Sci, Chicago, IL 60616 USA
Source
2016 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER) | 2016
Keywords
Plan-based scheduling; Simulated Annealing algorithm; Optimization
DOI
10.1109/CLUSTER.2016.43
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As HPC systems scale toward exascale, it becomes critical to manage the underlying resources more effectively. Almost all existing resource management systems schedule jobs in a queuing fashion and therefore make isolated scheduling decisions that can compromise system performance, even with backfilling. Plan-based schedulers have the potential to generate better job schedules by producing an execution plan for all waiting jobs, but they have not received enough attention. In this paper, we present a novel plan-based scheduling system that uses simulated annealing as the optimization engine to support effective resource management on HPC systems. Extensive trace-based simulations with workload traces collected from a wide range of production supercomputers show that, compared with a queue-based scheduling system using FCFS with EASY backfilling, our plan-based scheduling system reduces job wait time by 40% and job response time by 30% while slightly improving system utilization. Moreover, our plan-based system can run online, solving the scheduling problem at each scheduling iteration within one second, which makes it practical for production HPC systems.
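The abstract does not give the paper's objective function, neighborhood move, or cooling schedule, so the following is only a minimal sketch of the general idea of plan-based scheduling with simulated annealing, not the authors' implementation. It assumes that all jobs are waiting at time 0, that a plan is simply an ordering of those jobs, that jobs start strictly in plan order, and that the cost is total wait time. The names `total_wait` and `anneal`, the swap move, and the parameters `steps`, `t0`, and `cooling` are hypothetical choices made for illustration.

```python
import heapq
import math
import random


def total_wait(order, jobs, total_nodes):
    """Total wait time when jobs start strictly in `order` on `total_nodes` nodes.

    jobs[i] = (nodes_needed, runtime). Assumption: every job is already
    waiting at time 0 and needs at most `total_nodes` nodes.
    """
    running = []  # min-heap of (end_time, nodes) for jobs that hold nodes
    free = total_nodes
    now = 0.0
    wait = 0.0
    for j in order:
        nodes, runtime = jobs[j]
        # Advance time until enough nodes are free for this job.
        while free < nodes:
            end, released = heapq.heappop(running)
            now = max(now, end)
            free += released
        wait += now  # the job was submitted at time 0
        free -= nodes
        heapq.heappush(running, (now + runtime, nodes))
    return wait


def anneal(jobs, total_nodes, steps=5000, t0=10.0, cooling=0.999, seed=0):
    """Search over job orderings with simulated annealing (illustrative only)."""
    rng = random.Random(seed)
    current = list(range(len(jobs)))  # start from the FCFS (arrival) order
    cur_cost = total_wait(current, jobs, total_nodes)
    best, best_cost = current[:], cur_cost
    temp = t0
    for _ in range(steps):
        # Neighborhood move: swap two positions in the plan.
        i, k = rng.sample(range(len(jobs)), 2)
        current[i], current[k] = current[k], current[i]
        new_cost = total_wait(current, jobs, total_nodes)
        delta = new_cost - cur_cost
        # Metropolis criterion: always accept improvements, and accept
        # worse plans with a probability that shrinks as temp falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = current[:], cur_cost
        else:
            current[i], current[k] = current[k], current[i]  # undo the swap
        temp *= cooling  # geometric cooling
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical workload: (nodes_needed, runtime) on a 4-node machine.
    jobs = [(4, 10), (1, 1), (1, 1), (2, 3)]
    plan, cost = anneal(jobs, total_nodes=4)
    print("FCFS total wait:", total_wait([0, 1, 2, 3], jobs, 4))
    print("Annealed plan:", plan, "total wait:", cost)
```

Because the search starts from the FCFS order and keeps the best plan seen, the returned plan is never worse than FCFS under this cost model. A production scheduler would also have to handle job arrivals over time, runtime estimates, and a per-iteration time budget, none of which are modeled here.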
Pages: 259-268
Page count: 10