Fast and Lightweight Execution Time Predictions for Spark Applications

Cited by: 7
Authors
Amannejad, Yasaman [1]
Shah, Sarah [2]
Krishnamurthy, Diwakar [2]
Wang, Mea [3]
Affiliations
[1] Mt Royal Univ, Math & Comp, Calgary, AB, Canada
[2] Univ Calgary, Elect & Comp Engn, Calgary, AB, Canada
[3] Univ Calgary, Comp Sci, Calgary, AB, Canada
Source
2019 IEEE 12TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (IEEE CLOUD 2019) | 2019
Keywords
Apache Spark; Big Data Processing; Performance Prediction; Performance Engineering; Scalability;
DOI
10.1109/CLOUD.2019.00088
CLC number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Users and operators of cloud-based Spark clusters often require quick insights into how the execution time of an application is likely to be affected by the resources allocated to it, e.g., the number of Spark executor cores assigned, and by the size of the data to be processed. Existing techniques typically require extensive prior executions of the application under various resource allocation settings and data sizes to obtain an accurate model. In this paper, we explore the accuracy of a model built from fewer prior executions of the application. Such a model can be useful in situations where quick predictions are required and few cluster resources are available for building a model. We use logs from two executions of an application with small sample data and different resource settings, and explore the accuracy of the predictions for other resource allocation settings and input data sizes.
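To make the idea of predicting from only two executions concrete, the sketch below fits a two-parameter Amdahl-style curve, T(n) = serial + parallel / n, to two measured runs with different core counts and then extrapolates to another core count. This is an illustrative assumption, not the model described in the paper: the abstract does not state the model's form, and the `Run` class, the function names, the timing values, and the linear `data_scale` factor are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One measured execution (hypothetical record, not a Spark log format)."""
    cores: int
    seconds: float


def fit_two_runs(a: Run, b: Run) -> tuple[float, float]:
    """Solve T(n) = serial + parallel / n exactly from two runs."""
    if a.cores == b.cores:
        raise ValueError("the two runs must use different core counts")
    inv_a, inv_b = 1.0 / a.cores, 1.0 / b.cores
    # Two equations, two unknowns: subtract them to eliminate the serial term.
    parallel = (a.seconds - b.seconds) / (inv_a - inv_b)
    serial = a.seconds - parallel * inv_a
    return serial, parallel


def predict(serial: float, parallel: float, cores: int,
            data_scale: float = 1.0) -> float:
    """Predict run time; assumes time grows linearly with input size."""
    return data_scale * (serial + parallel / cores)


# Made-up sample measurements on a small data set.
serial, parallel = fit_two_runs(Run(cores=2, seconds=60.0),
                                Run(cores=4, seconds=40.0))
estimate_8_cores = predict(serial, parallel, cores=8)
estimate_8_cores_double_data = predict(serial, parallel, cores=8, data_scale=2.0)
```

With exactly two runs the fit is an interpolation with no error estimate, so any such prediction for larger core counts or data sizes would need to be checked against real executions.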
Pages: 493-495
Page count: 3