Scheduling parallel jobs on multicore clusters using CPU oversubscription

Cited by: 8
Authors
Utrera, Gladys [1]
Corbalan, Julita [1]
Labarta, Jesus [2]
Affiliations
[1] Univ Politecn Cataluna, Comp Architecture Dept, BarcelonaTech, ES-08034 Barcelona, Spain
[2] Barcelona Supercomp Ctr, Barcelona 08034, Spain
Keywords
Job scheduling; MPI; Malleability; Application reconfiguration; Multicore clusters; CPU oversubscription; Performance
DOI
10.1007/s11227-014-1142-9
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Job scheduling strategies in multiprocessing systems aim to minimize job waiting times while satisfying user requirements in terms of the number of execution units. However, the lack of flexibility in these requests leaves the scheduler little margin for scheduling decisions. Many such decisions consist of just moving specific jobs ahead in the wait queue. In this work, we propose a job scheduling strategy that improves overall performance and maximizes resource utilization by allowing jobs to adapt to variations in the load through CPU oversubscription and backfilling. The experimental evaluation includes both real executions on multicore clusters and simulations of workload traces from real production systems. The results show that our strategy provides significant improvements over previous proposals such as Gang Scheduling with Backfilling, especially under medium to high workloads with strong variations.
Pages: 1113-1140
Page count: 28