GOLGI: Performance-Aware, Resource-Efficient Function Scheduling for Serverless Computing

Cited by: 11

Authors:

Li, Suyi [1]

Wang, Wei [1]

Yang, Jun [2]

Chen, Guangzhen [2]

Lu, Daohe [2]

Affiliations:

[1] HKUST, Hong Kong, People's Republic of China

[2] WeBank, Shenzhen, People's Republic of China

Source:

PROCEEDINGS OF THE 2023 ACM SYMPOSIUM ON CLOUD COMPUTING, SOCC 2023 | 2023

Keywords:

Serverless Computing; Resource Management; Scheduling;

DOI:

10.1145/3620678.3624645

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper introduces GOLGI, a novel scheduling system for serverless functions that aims to minimize resource provisioning costs while meeting function latency requirements. To achieve this, GOLGI judiciously overcommits functions based on their past resource usage. To ensure that overcommitment does not cause significant performance degradation, GOLGI identifies nine low-level metrics that capture the runtime performance of functions, encompassing factors such as request load, resource allocation, and contention on shared resources. These metrics enable accurate prediction of function performance using the Mondrian Forest, a classification model that is continuously updated in real time for optimal accuracy without extensive offline training. GOLGI employs a conservative exploration-exploitation strategy for request routing. By default, it routes requests to non-overcommitted instances to ensure satisfactory performance. However, it actively explores opportunities to use more resource-efficient overcommitted instances while maintaining the specified latency SLOs. GOLGI also performs vertical scaling to dynamically adjust the concurrency of overcommitted instances, maximizing request throughput and enhancing system robustness to prediction errors. We have prototyped GOLGI and evaluated it on both an EC2 cluster and a small production cluster. The results show that GOLGI can meet the SLOs while reducing the resource provisioning cost by 42% on the EC2 cluster and by 30% on our production cluster.
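
The routing idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Instance` type, the `route` function, the `cpu_util` metric name, and the fixed-threshold predictor are all hypothetical stand-ins chosen for illustration. The abstract describes nine low-level metrics and an online Mondrian Forest classifier, neither of which is reproduced here. The sketch only shows the conservative rule: use an overcommitted instance when a predictor labels it safe for the latency SLO, and otherwise fall back to a non-overcommitted one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Instance:
    name: str
    overcommitted: bool
    # Hypothetical runtime metrics; the real system uses nine low-level metrics.
    metrics: Dict[str, float]


def route(
    instances: List[Instance],
    predicts_slo_safe: Callable[[Dict[str, float]], bool],
) -> Optional[Instance]:
    """Pick an instance for one request.

    An overcommitted instance is chosen only when the predictor labels it
    safe; otherwise the request goes to a non-overcommitted instance.
    Returns None when no instance qualifies.
    """
    for inst in instances:
        if inst.overcommitted and predicts_slo_safe(inst.metrics):
            return inst
    for inst in instances:
        if not inst.overcommitted:
            return inst
    return None


def threshold_predictor(metrics: Dict[str, float]) -> bool:
    """Stand-in predictor (an assumption): safe if CPU utilization is below 0.6.

    A missing metric is treated as unsafe, which keeps the rule conservative.
    """
    return metrics.get("cpu_util", 1.0) < 0.6


if __name__ == "__main__":
    pool = [
        Instance("overcommitted-1", True, {"cpu_util": 0.9}),
        Instance("regular-1", False, {"cpu_util": 0.2}),
    ]
    print(route(pool, threshold_predictor).name)
```

In this sketch a busy overcommitted instance is skipped in favor of the regular one; lowering its `cpu_util` below the assumed threshold makes it eligible again.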


Pages: 32-47

Page count: 16
