TetriSched: global rescheduling with adaptive plan-ahead in dynamic heterogeneous clusters

Cited by: 122
Authors
Tumanov, Alexey [1]
Zhu, Timothy [1]
Park, Jun Woo [1]
Kozuch, Michael A. [2]
Harchol-Balter, Mor [1]
Ganger, Gregory R. [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Intel Labs, Santa Clara, CA USA
Source
Proceedings of the Eleventh European Conference on Computer Systems (EuroSys 2016) | 2016
DOI
10.1145/2901318.2901355
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
TetriSched is a scheduler that works in tandem with a calendaring reservation system to continuously re-evaluate the immediate-term scheduling plan for all pending jobs (including those with reservations and best-effort jobs) on each scheduling cycle. TetriSched leverages information supplied by the reservation system about jobs' deadlines and estimated runtimes to plan ahead in deciding whether to wait for a busy preferred resource type (e.g., a machine with a GPU) or fall back to less preferred placement options. Plan-ahead affords significant flexibility in handling mis-estimates in job runtimes specified at reservation time. Integrated with the main reservation system in Hadoop YARN, TetriSched is experimentally shown to achieve significantly higher SLO attainment and cluster utilization than the best-configured YARN reservation and CapacityScheduler stack deployed on a real 256-node cluster.
Pages: 16