Tarcil: Reconciling Scheduling Speed and Quality in Large Shared Clusters

Cited by: 99
Authors
Delimitrou, Christina [1]
Sanchez, Daniel [2]
Kozyrakis, Christos [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] MIT, Cambridge, MA 02139 USA
Source
ACM SoCC'15: Proceedings of the Sixth ACM Symposium on Cloud Computing | 2015
Keywords
Cloud computing; datacenters; scheduling; QoS; resource-efficiency; scalability
DOI
10.1145/2806777.2806779
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Scheduling diverse applications in large, shared clusters is particularly challenging. Recent research on cluster scheduling focuses either on scheduling speed, using sampling to quickly assign resources to tasks, or on scheduling quality, using centralized algorithms that search for the resources that improve both task performance and cluster utilization. We present Tarcil, a distributed scheduler that targets both scheduling speed and quality. Tarcil uses an analytically derived sampling framework that adjusts the sample size based on load, and provides statistical guarantees on the quality of allocated resources. It also implements admission control when sampling is unlikely to find suitable resources. This makes it appropriate for large, shared clusters hosting short- and long-running jobs. We evaluate Tarcil on clusters with hundreds of servers on EC2. For highly loaded clusters running short jobs, Tarcil improves task execution time by 41% over a distributed, sampling-based scheduler. For more general scenarios, Tarcil achieves near-optimal performance for 4x and 2x more jobs than sampling-based and centralized schedulers, respectively.
Pages: 97-110
Number of pages: 14