Fregata: A Low-Latency and Resource-Efficient Scheduling for Heterogeneous Jobs in Clouds

Cited by: 1
Authors
Liu, Jinwei [1 ]
Affiliations
[1] Florida A&M Univ, Dept Comp & Informat Sci, Tallahassee, FL 32307 USA
Source
2022 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (IEEE BIGCOMP 2022) | 2022
Keywords
scheduling; task dependency; resource utilization; latency; machine learning;
DOI
10.1109/BigComp54360.2022.00013
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
An increasing number of large-scale data analytics frameworks move toward higher degrees of parallelism in pursuit of low-latency guarantees. Designing a scheduler that achieves both low latency and high resource utilization is challenging because of task dependency and job heterogeneity. State-of-the-art schedulers in clouds and datacenters cannot effectively schedule heterogeneous jobs with dependency constraints (e.g., dependencies among the tasks of a job) while simultaneously achieving low latency and high resource utilization: centralized schedulers have limited scalability, and both distributed and hybrid schedulers suffer from ineffective and inefficient probing and resource sharing. To address this challenge, we propose Fregata, a low-latency and resource-efficient scheduler for heterogeneous jobs with constraints (e.g., dependency constraints among the tasks of a job) in clouds. Fregata first uses a machine learning algorithm to classify jobs into two categories, high-priority and low-priority, based on extracted features. Next, Fregata splits jobs into tasks and distributes the tasks to master nodes according to task dependency and the load on the master nodes. Then, Fregata uses task dependency information to determine task priority (tasks with more dependent tasks have higher priority) and packs tasks by exploiting the complementarity of their demands on different resource types together with task dependency. Finally, the master nodes distribute tasks to workers based on the priorities of tasks and workers, matching the resource demands of tasks against the available resources of workers. To evaluate Fregata, we conduct trace-driven experiments. Extensive experimental results on a real cluster and on the Amazon EC2 cloud service show that Fregata achieves lower latency and higher resource utilization than existing schedulers.
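For illustration only, the Python sketch below shows two of the heuristics the abstract describes: ranking tasks by how many other tasks depend on them, and packing tasks whose demands on different resource types complement each other. The names (Task, rank_by_dependents, pack_complementary), the direct-dependents count, the greedy first-fit packing rule, and the toy job are assumptions made for this sketch; they are not Fregata's actual implementation.

```python
# Minimal sketch (assumptions, not Fregata's code) of dependency-based task
# priority and complementarity-aware task packing as described in the abstract.
from dataclasses import dataclass, field
from collections import defaultdict

@dataclass
class Task:
    name: str
    cpu: float                                      # normalized CPU demand (fraction of a worker)
    mem: float                                      # normalized memory demand (fraction of a worker)
    deps: list[str] = field(default_factory=list)   # names of tasks this task depends on

def rank_by_dependents(tasks: list[Task]) -> list[Task]:
    """Give higher priority to tasks that more tasks directly depend on."""
    dependents = defaultdict(int)
    for t in tasks:
        for d in t.deps:
            dependents[d] += 1
    return sorted(tasks, key=lambda t: dependents[t.name], reverse=True)

def pack_complementary(tasks: list[Task], cpu_cap: float = 1.0, mem_cap: float = 1.0):
    """Greedily bundle tasks so CPU-heavy and memory-heavy tasks share capacity."""
    bundles: list[dict] = []
    for t in rank_by_dependents(tasks):
        for b in bundles:
            if b["cpu"] + t.cpu <= cpu_cap and b["mem"] + t.mem <= mem_cap:
                b["cpu"] += t.cpu
                b["mem"] += t.mem
                b["tasks"].append(t.name)
                break
        else:
            bundles.append({"cpu": t.cpu, "mem": t.mem, "tasks": [t.name]})
    return bundles

if __name__ == "__main__":
    # Toy example: two CPU-heavy map tasks and one memory-heavy reduce task
    # that depends on both; the reduce task co-locates with a map task because
    # their CPU/memory demands are complementary.
    tasks = [
        Task("map1", cpu=0.6, mem=0.2),
        Task("map2", cpu=0.5, mem=0.3),
        Task("reduce", cpu=0.2, mem=0.6, deps=["map1", "map2"]),
    ]
    print(pack_complementary(tasks))
```

In this sketch the packing step simply reuses the dependency-based ordering; the paper's scheduler additionally accounts for worker priorities and available resources when the master nodes dispatch tasks.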
Pages: 15-22
Number of pages: 8