Robustness challenges in Reinforcement Learning based time-critical cloud resource scheduling: A Meta-Learning based solution

Cited by: 7

Authors:

Liu, Hongyun ^{[1,2]}

Chen, Peng ^{[3]}

Ouyang, Xue ^{[4]}

Gao, Hui ^{[5]}

Yan, Bing ^{[6]}

Grosso, Paola ^{[1]}

Zhao, Zhiming ^{[1]}

Affiliations:

[1] Univ Amsterdam, Informat Inst, NL-1098 XH Amsterdam, Netherlands

[2] Univ Amsterdam, Grad Sch Informat, NL-1098 XH Amsterdam, Netherlands

[3] Xihua Univ, Sch Comp & Software Engn, Chengdu 610039, Peoples R China

[4] Natl Univ Def Technol, Sch Comp Sci, Changsha 410073, Peoples R China

[5] Shaanxi Univ Sci & Technol, Coll Elect & Control Engn, Xian 710021, Peoples R China

[6] Univ Adelaide, Sch Elect & Elect Engn, Adelaide, SA 5005, Australia

Source:

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2023 / Vol. 146

Funding:

National Natural Science Foundation of China;

Keywords:

Robustness; Reinforcement Learning; Meta Learning; Resource management; Task scheduling; Cloud computing; MANAGEMENT;

DOI:

10.1016/j.future.2023.03.029

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Cloud computing attracts increasing attention in processing dynamic computing tasks and automating the software development and operation pipeline. In many cases, the computing tasks have strict deadlines. The cloud resource manager (e.g., orchestrator) effectively manages the resources and provides tasks Quality of Service (QoS). Cloud task scheduling is tricky due to the dynamic nature of task workload and resource availability. Reinforcement Learning (RL) has attracted lots of research attention in scheduling. However, those RL-based approaches suffer from low scheduling performance robustness when the task workload and resource availability change, particularly when handling time-critical tasks. This paper focuses on both challenges of robustness and deadline guarantee among such RL, specifically Deep RL (DRL)-based scheduling approaches. We quantify the robustness measurements as the retraining time and investigate how to improve both robustness and deadline guarantee of DRL-based scheduling. We propose MLR-TC-DRLS, a practical, robust Meta Deep Reinforcement Learning-based scheduling solution to provide time-critical tasks deadline guarantee and fast adaptation under highly dynamic situations. We comprehensively evaluate MLR-TC-DRLS performance against RL-based and RL advanced variants-based scheduling approaches using real-world and synthetic data. The evaluations validate that our proposed approach improves the scheduling performance robustness of typical DRL variants scheduling approaches with 97%-98.5% deadline guarantees and 200%-500% faster adaptation.


Pages: 18-33

Page count: 16

