Deep and reinforcement learning for automated task scheduling in large-scale cloud computing systems

Cited by: 89
Authors
Rjoub, Gaith [1]
Bentahar, Jamal [1]
Wahab, Omar Abdel [2]
Bataineh, Ahmed Saleh [1]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Sir George Williams Campus,1455 Maisonneuve Blvd, Montreal, PQ, Canada
[2] Univ Quebec Outaouais, Dept Comp Sci & Engn, Gatineau, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
cloud automation; deep learning; reinforcement learning; task scheduling; resource allocation
DOI
10.1002/cpe.5919
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Cloud computing is undeniably becoming the main computing and storage platform for today's major workloads. From Internet of Things and Industry 4.0 workloads to big data analytics and decision-making jobs, cloud systems receive a massive number of tasks every day that must be mapped onto cloud resources simultaneously and efficiently. Therefore, deriving an appropriate task scheduling mechanism that minimizes both tasks' execution delay and cloud resource utilization is of prime importance. Recently, the concept of cloud automation has emerged to reduce manual intervention and improve resource management in large-scale cloud computing workloads. In this article, we capitalize on this concept and propose four deep and reinforcement learning-based scheduling approaches to automate the process of scheduling large-scale workloads onto cloud computing resources, while reducing both resource consumption and task waiting time. These approaches are: reinforcement learning (RL), deep Q networks, recurrent neural network long short-term memory (RNN-LSTM), and deep reinforcement learning combined with LSTM (DRL-LSTM). Experiments conducted using real-world datasets from Google Cloud Platform revealed that DRL-LSTM outperforms the other three approaches. The experiments also showed that DRL-LSTM reduces the CPU usage cost by up to 67% compared with shortest job first (SJF), and by up to 35% compared with both the round robin (RR) and improved particle swarm optimization (PSO) approaches. Moreover, the DRL-LSTM solution decreases the RAM usage cost by up to 72% compared with SJF, up to 65% compared with RR, and up to 31.25% compared with improved PSO.
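For readers unfamiliar with the two classical baselines named in the abstract, the following is a minimal sketch of shortest job first (SJF) and round robin (RR) scheduling on a single resource, measuring per-task waiting time. It is not the article's implementation or cost model, which the abstract does not specify. The function names, the assumption that all tasks arrive at time 0, and the single-resource setting are illustrative assumptions only.

```python
from collections import deque


def sjf_waiting_times(burst_times):
    """Non-preemptive shortest job first (assumption: all tasks arrive at time 0).

    Returns the waiting time of each task, indexed like burst_times.
    """
    order = sorted(range(len(burst_times)), key=lambda i: burst_times[i])
    waits = [0] * len(burst_times)
    clock = 0
    for i in order:
        waits[i] = clock          # task i starts once all shorter tasks are done
        clock += burst_times[i]
    return waits


def rr_waiting_times(burst_times, quantum):
    """Round robin with a fixed time quantum (assumption: all tasks arrive at time 0).

    Waiting time is completion time minus the task's own burst time.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    remaining = list(burst_times)
    queue = deque(range(len(burst_times)))
    finish = [0] * len(burst_times)
    clock = 0
    while queue:
        i = queue.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)       # not finished: go to the back of the queue
        else:
            finish[i] = clock
    return [finish[i] - burst_times[i] for i in range(len(burst_times))]


if __name__ == "__main__":
    bursts = [6, 2, 4]            # hypothetical task lengths in time units
    print("SJF waits:", sjf_waiting_times(bursts))
    print("RR waits: ", rr_waiting_times(bursts, quantum=2))
```

On this toy workload SJF gives a lower mean waiting time than RR, which is the kind of metric the abstract reports alongside CPU and RAM usage cost; the learned schedulers in the article are compared against such baselines on real traces rather than on a toy input like this one.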
Pages: 14