Rover: An Online Spark SQL Tuning Service via Generalized Transfer Learning

Cited by: 7

Authors:

Shen, Yu ^{[1,2]}

Ren, Xinyuyang ^{[2]}

Lu, Yupeng ^{[1,2]}

Jiang, Huaijun ^{[2,3]}

Xu, Huanyong ^{[2]}

Peng, Di ^{[2]}

Li, Yang ^{[1]}

Zhang, Wentao ^{[4]}

Cui, Bin ^{[5]}

Affiliations:

[1] Peking Univ, Sch CS, Beijing, Peoples R China

[2] ByteDance Inc, Beijing, Peoples R China

[3] Peking Univ, Ctr Data Sci, Beijing, Peoples R China

[4] Mila Quebec AI Inst, Montreal, PQ, Canada

[5] Peking Univ, Inst Computat Social Sci, Sch CS, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023 | 2023

Keywords:

Spark SQL; Bayesian Optimization; Transfer Learning; MAP-REDUCE; SYSTEM;

DOI:

10.1145/3580305.3599953

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Distributed data analytics engines like Spark are common choices for processing massive data in industry. However, the performance of Spark SQL depends heavily on the choice of configurations, and the optimal ones vary with the executed workloads. Among the alternatives for Spark SQL tuning, Bayesian optimization (BO) is a popular framework that finds near-optimal configurations given sufficient budget, but it suffers from the re-optimization issue and is not practical in real production. When applying transfer learning to accelerate the tuning process, we notice two domain-specific challenges: 1) most previous work focuses on transferring tuning history, while expert knowledge from Spark engineers has great potential to improve tuning performance but has not been well studied so far; 2) history tasks should be utilized carefully, since using dissimilar ones leads to deteriorated performance in production. In this paper, we present Rover, a deployed online Spark SQL tuning service for efficient and safe search on industrial workloads. To address these challenges, we propose generalized transfer learning to boost tuning performance based on external knowledge, including expert-assisted Bayesian optimization and controlled history transfer. Experiments on public benchmarks and real-world tasks show the superiority of Rover over competitive baselines. Notably, Rover saves an average of 50.1% of the memory cost on 12k real-world Spark SQL tasks in 20 iterations, among which 76.2% of the tasks achieve a significant memory reduction of over 60%.

Pages: 4800-4812

Page count: 13
