The Design Space of Emergent Scheduling for Distributed Execution Frameworks

Cited by: 5
Authors
Dean, Paul [1]
Porter, Barry [1]
Affiliations
[1] Univ Lancaster, Sch Comp & Commun, Lancaster, England
Source
2021 INTERNATIONAL SYMPOSIUM ON SOFTWARE ENGINEERING FOR ADAPTIVE AND SELF-MANAGING SYSTEMS (SEAMS 2021) | 2021
DOI
10.1109/SEAMS51251.2021.00032
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Distributed Execution Frameworks (DEFs) such as Apache Spark have become ubiquitous as a solution for the execution of user-defined jobs to process terabytes of data across hundreds of nodes. One of the key costs of DEFs is scheduling of which parts of each job are placed on each host; better scheduling decisions provide lower overall execution time for each job, more efficient resource usage, and reduced energy consumption. Existing DEFs use a static approach to scheduling, either with a single generalised scheduler which aims to be a good fit for most workloads, or with a special-purpose scheduler which is tuned to optimise for a particular kind of workload. In both cases the scheduling implementation is fixed at design-time such that the DEF is unable to adjust to the actual characteristics of workloads that arrive at deployment time. In this paper we introduce an emergent scheduler for Distributed Execution Frameworks. This scheduler can be composed and re-composed at runtime from a set of different building blocks, allowing the system to dynamically provide the benefits of differing scheduling policies over time depending on the actual properties of incoming workloads - with improved performance and resource usage. In this paper we present the overall design of our emergent scheduler, we discuss the theoretical design space of different scheduling approaches, and we examine a specific research question to determine the correlation between workload properties and scheduling performance for different scheduler implementations. Our results are based on a real implementation of our emergent DEF running across multiple hosts in a real datacentre, and our implementation is made available as open-source software.
Pages: 186-195
Page count: 10