Communication-aware Job Scheduling using SLURM

Cited by: 2
Authors
Mishra, Priya [1 ]
Agrawal, Tushar [1 ]
Malakar, Preeti [1 ]
Affiliation
[1] Indian Inst Technol Kanpur, Kanpur, Uttar Pradesh, India
Source
49TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOP PROCEEDINGS, ICPP 2020 | 2020
Keywords
job scheduling; communication-aware; job-aware; SLURM; PERFORMANCE; OPERATIONS;
DOI
10.1145/3409390.3409410
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Job schedulers play an important role in selecting optimal resources for the submitted jobs. However, most of the current job schedulers do not consider job-specific characteristics such as communication patterns during resource allocation. This often leads to sub-optimal node allocations. We propose three node allocation algorithms that consider the job's communication behavior to improve the performance of communication-intensive jobs. We develop our algorithms for tree-based network topologies. The proposed algorithms aim at minimizing network contention by allocating nodes on the least contended switches. We also show that allocating nodes in powers of two leads to a decrease in inter-switch communication for MPI communications, which further improves performance. We implement and evaluate our algorithms using SLURM, a widely-used and well-known job scheduler. We show that the proposed algorithms can reduce the execution times of communication-intensive jobs by 9% (326 hours) on average. The average wait time of jobs is reduced by 31% across three supercomputer job logs.
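The abstract names two ideas: prefer the least contended switches, and allocate nodes in powers of two. The Python sketch below illustrates those two ideas only; it is not one of the paper's three algorithms and does not use SLURM's API. The `Switch` class, the `contention` score, and the `allocate` function are hypothetical names introduced for illustration, and the sketch assumes a flat set of leaf switches rather than a full tree.

```python
from dataclasses import dataclass


@dataclass
class Switch:
    name: str          # hypothetical leaf-switch identifier
    free_nodes: int    # idle nodes attached to this switch
    contention: float  # hypothetical load score; lower means less contended


def largest_power_of_two(n: int) -> int:
    """Return the largest power of two <= n, or 0 when n < 1."""
    return 1 << (n.bit_length() - 1) if n >= 1 else 0


def allocate(switches: list[Switch], requested: int) -> dict[str, int]:
    """Greedy sketch: visit switches from least to most contended and take
    a power-of-two chunk of nodes from each until the request is met."""
    if requested > sum(s.free_nodes for s in switches):
        raise ValueError("not enough free nodes")

    free = {s.name: s.free_nodes for s in switches}
    order = sorted(switches, key=lambda s: s.contention)
    allocation: dict[str, int] = {}
    remaining = requested

    # Repeat passes so that leftovers (e.g. 3 = 2 + 1) are still placed.
    while remaining > 0:
        for sw in order:
            take = largest_power_of_two(min(free[sw.name], remaining))
            if take:
                allocation[sw.name] = allocation.get(sw.name, 0) + take
                free[sw.name] -= take
                remaining -= take
    return allocation


if __name__ == "__main__":
    cluster = [
        Switch("s1", free_nodes=8, contention=0.7),
        Switch("s2", free_nodes=6, contention=0.1),
        Switch("s3", free_nodes=4, contention=0.4),
    ]
    print(allocate(cluster, 8))
```

In this example the request for 8 nodes is split as 4 nodes on `s2` and 4 on `s3`, the two least contended switches, even though `s1` alone could hold the whole job. A real scheduler would also have to weigh the cost of spreading a job across switches, which this sketch ignores.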
Pages: 10
Related papers
50 in total
  • [1] Communication-Aware Scheduling for Malleable Tasks
    Shimada, Kana
    Taniguchi, Ittetsu
    Tomiyama, Hiroyuki
    2019 INTERNATIONAL CONFERENCE ON PLATFORM TECHNOLOGY AND SERVICE (PLATCON), 2019, : 11 - 16
  • [3] Communication-Aware Affinity Scheduling Heuristics in Multicore Systems
    Regueira, Diego
    Iturriaga, Santiago
    Nesmachnow, Sergio
    HIGH PERFORMANCE COMPUTING CARLA 2016, 2017, 697 : 33 - 48
  • [4] Communication-Aware Scheduling of Serial Tasks for Dispersed Computing
    Yang, Chien-Sheng
    Pedarsani, Ramtin
    Avestimehr, A. Salman
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2019, 27 (04) : 1330 - 1343
  • [5] A communication-aware task scheduling algorithm for heterogeneous systems
    Lai, GJ
    14TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2003, : 161 - 166
  • [6] Communication-Aware Scheduling of Serial Tasks for Dispersed Computing
    Yang, Chien-Sheng
    Pedarsani, Ramtin
    Avestimehr, A. Salman
    2018 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2018, : 1226 - 1230
  • [7] Scheduling communication-aware tasks on distributed heterogeneous computing systems
    Lai, GJ
    24TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS WORKSHOPS, PROCEEDINGS, 2004, : 852 - 857
  • [8] On the design of communication-aware task scheduling strategies for heterogeneous systems
    Orduña, JM
    Arnau, V
    Ruiz, A
    Valero, R
    Duato, J
    2000 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, PROCEEDINGS, 2000, : 391 - 398
  • [9] Communication-aware scheduling algorithm based on heterogeneous computing systems
    Ruan, Youlin
    Liu, Gan
    Han, Jianjun
    Li, Qinghua
    COMPUTATIONAL SCIENCE - ICCS 2007, PT 1, PROCEEDINGS, 2007, 4487 : 426 - +
  • [10] Towards a communication-aware task scheduling strategy for heterogeneous systems
    Orduña, JM
    Silla, F
    Duato, J
    COMPUTING AND INFORMATICS, 2001, 20 (03) : 245 - 267