Advances in Topology-Aware Scheduling in Multidimensional Torus-Based Systems

Times cited: 0
Authors
Li, Kangkang [1]
Malawski, Maciej [2]
Oleksy, Piotr [2]
Nabrzyski, Jarek [1]
Affiliations
[1] Univ Notre Dame, Dept Comp Sci & Engn, Notre Dame, IN 46556 USA
[2] AGH Univ Sci & Technol, Dept Comp Sci, Krakow, Poland
Source
NEW FRONTIERS IN HIGH PERFORMANCE COMPUTING AND BIG DATA | 2017 / Vol. 30
Funding
National Science Foundation (USA);
Keywords
Topology-aware; torus-based; job scheduling; job mapping; ALLOCATION; ALGORITHMS; PROCESSOR; STRATEGIES;
DOI
10.3233/978-1-61499-816-7-93
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Communication networks in recent high-performance computing (HPC) machines often have multidimensional torus topologies, which influence the way jobs should be scheduled onto the system. With the rapid growth in the size of modern HPC systems' interconnects, network contention has become a critical issue for the performance of parallel jobs, especially those that are communication-intensive and intolerant of inter-job interference. Moreover, to improve runtime consistency, a contiguous allocation strategy is usually adopted, in which each job is allocated a convex prism. However, this strategy introduces internal and external fragmentation, which can degrade system utilization. To this end, in this work we investigate and develop various topology-aware job scheduling strategies for multidimensional torus-based systems, with the objective of improving job performance and system utilization.
Pages: 93-118
Page count: 26